Ubuntu.
-------------------------------------------------------------------------
--WORKING questions---
-------------------------------------------------------------------------
- setup appscale
https://github.com/AppScale/appscale/wiki
- vagrant box for android??
- setup hadoop
The project includes these modules:
Hadoop Common: The common utilities that support the other Hadoop modules.
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Hadoop YARN: A framework for job scheduling and cluster resource management.
Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
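Related projects
HBase: a distributed NoSQL column-oriented database similar to Google BigTable.
Hive: a data warehouse tool, contributed by Facebook.
Zookeeper: a distributed lock service offering functionality similar to Google Chubby, originally developed at Yahoo.
Avro: a new data serialization format and transport tool that will gradually replace Hadoop's original IPC mechanism.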
- Single Node Setup ?
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
An example (could not get it to build!): http://pages.cs.wisc.edu/~gibson/mapReduceTutorial.html
/usr/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount input output
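As a reminder of what the wordcount job computes, here is a minimal single-process sketch of the map, shuffle and reduce steps. It is not the Hadoop example's code; the function names are made up for illustration and nothing here talks to a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word, 1) for word in line.split()]


def reduce_phase(word, counts):
    """Reducer: sum all the counts emitted for one word."""
    return word, sum(counts)


def wordcount(lines):
    """Run map, shuffle (group by key) and reduce inside one process."""
    grouped = defaultdict(list)
    for line in lines:
        for word, one in map_phase(line):
            grouped[word].append(one)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())
```

Calling wordcount(["hello hadoop", "hello yarn"]) counts "hello" twice and the other two words once each.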
- Running wordcount sample using MRV1 on CDH4.0.1 VM [Solved]?
CDH4 (Cloudera's Distribution, including Apache Hadoop)
The world's leading Apache Hadoop distribution.
downloaded the VM from https://downloads.cloudera.com/demo_vm/vmware/cloudera-demo-vm-cdh4.0.0-vmware.tar.gz
username: cloudera
password: cloudera
The cloudera account has sudo privileges in the VM.
You can access status through the browser at the following URLs:
NameNode status (localhost:50070)
JobTracker status (localhost:50030)
The Hue user interface (localhost:8088)
The HBase web UI (localhost:60010)
CDH4 Installation
https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation
CDH4 introduces a new version of MapReduce: MapReduce 2.0 (MRv2) built on the YARN framework
CDH4 also provides an implementation of the previous version of MapReduce, now referred to as MRv1
The fundamental idea of MRv2's YARN architecture is to split up the two primary responsibilities of the JobTracker — resource management and job scheduling/monitoring — into separate daemons: a global ResourceManager (RM) and per-application ApplicationMasters (AM).
Ways To Install CDH4
Automated method using Cloudera Manager Free Edition;
Manual methods described below:
CDH3                          CDH4
JobTracker                ->  ResourceManager
slave nodes: TaskTracker  ->  NodeManagers
- hadoop Cluster Setup?
-------------------------------------------------------------------------
--FINISHED questions---
-------------------------------------------------------------------------
- Change the screen resolution
xrandr --newmode "1920x1080_60.00" 173.00 1920 2048 2248 2576 1080 1083 1088 1120 -hsync +vsync
xrandr --addmode VGA-1 1920x1080_60.00
- About AppScale
AppScale
is an open-source framework for running Google App Engine applications.
It is an implementation of a cloud computing platform (Platform-as-a-Service), supporting Xen, KVM, Amazon EC2 and Eucalyptus.
It has been developed and is maintained by AppScale Systems.[1][2][3][4][5]
AppScale allows users to upload multiple App Engine applications to a cloud.
It supports multiple distributed backends such as HBase, Hypertable, Apache Cassandra, Voldemort (distributed data store), MySQL Cluster, and Redis.
It has support for Python, Go, and Java applications, taking the open source SDK provided by Google App Engine and implementing scalable services such as the datastore, memcache, blobstore, users API, and channel API.
-vagrant
uses Oracle’s VirtualBox to build configurable, lightweight, and portable virtual machines dynamically
Install Vagrant
download http://files.vagrantup.com/packages/87613ec9392d4660ffcb1d5755307136c06af08c/vagrant_i686.deb and open it; the system's package installer takes care of the installation
First Vagrant Virtual Environment
$ vagrant box add lucid32 http://files.vagrantup.com/lucid32.box
$ vagrant init lucid32
$ vagrant up
Why Vagrant?
The Vagrantfile
- Cloud computing
Software as a Service (SaaS): e.g. Google Docs
Examples of SaaS include: Google Apps, Microsoft Office 365, Onlive, GT Nexus, Marketo, and TradeCard.
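Platform as a Service (PaaS): the consumer controls the environment that runs the application (and has partial control of the host), but does not control the operating system, hardware, or underlying network infrastructure. The platform is usually the application infrastructure. Example: Google App Engine.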
Examples of PaaS include: AWS Elastic Beanstalk, Cloud Foundry, Heroku, Force.com, EngineYard, Mendix, OpenShift, Google App Engine, Windows Azure Cloud Services and OrangeScape.
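Infrastructure as a Service (IaaS): the consumer controls the operating system, storage, deployed applications, and network components (such as firewalls and load balancers), but does not control the cloud infrastructure itself. Examples: Amazon AWS, Rackspace.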
Examples of IaaS providers include: Amazon EC2, Azure Services Platform, DynDNS, Google Compute Engine, HP Cloud, iland, Joyent, LeaseWeb, Linode, NaviSite, Oracle Infrastructure as a Service, Rackspace Cloud, ReadySpace Cloud Services, ReliaCloud, SAVVIS, SingleHop, and Terremark
Network as a service (NaaS) Traditional NaaS services include flexible and extended VPN
Open source
examples being the Hadoop framework[98] and VMware's Cloud Foundry
Open standards
OpenStack
List of open source clouds
DRBD®
OpenNebula
Eucalyptus
Nimbus - cloud computing for science
CloudStack - The Apache Software Foundation!
Deltacloud API - The Apache Software Foundation!
Cloud Foundry by VMware
OpenStack
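- After memcached is installed, the basic ways to start it are
* `memcached -l 127.0.0.1 -p 11211 -m 128 -d` for daemon
* `memcached -l 127.0.0.1 -p 11211 -m 128 -vv` for development debug
- How to test whether memcached is working?
You can telnet to memcached directly: telnet 127.0.0.1 11211
After connecting, type stats to get some statistics. Besides the current number of items and the total space in use, the important ones are cmd_get and get_hits, which give the cache hit ratio (get_hits / cmd_get); aim to get this number above 90%. In addition, cmd_get should normally exceed cmd_set, since a cache only pays off when reads outnumber writes.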
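The hit ratio can be computed from the stats output with a few lines of Python. This is a sketch that only parses text that has already been read from the server; memcached reports hits under the name get_hits, and the SAMPLE values below are made up for illustration.

```python
def parse_stats(raw):
    """Parse the text reply of the memcached `stats` command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        # Every statistic comes back as "STAT <name> <value>"; skip "END".
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Return get_hits / cmd_get, or None when no get has been issued yet."""
    gets = int(stats.get("cmd_get", 0))
    if gets == 0:
        return None
    return int(stats.get("get_hits", 0)) / gets


# Hypothetical reply, shortened to the counters that matter here.
SAMPLE = (
    "STAT cmd_get 200\r\n"
    "STAT cmd_set 20\r\n"
    "STAT get_hits 190\r\n"
    "STAT get_misses 10\r\n"
    "END\r\n"
)
```

With the sample above, hit_ratio(parse_stats(SAMPLE)) is 190 / 200.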
- How to use memcached?
$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
set foo 0 0 3 (store command)
bar (data)
STORED (result)
get foo (retrieve command)
VALUE foo 0 3 (reply header: key, flags, byte count)
bar (data)
END (end of reply)
The protocol document ships with the memcached source code; the following URL can also be consulted.
http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt
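The telnet session above maps directly onto bytes on the wire. Below is a small sketch of how a client might build a set request and parse a get reply in the text protocol. The helper names are invented for illustration, and the parser only handles plain get replies (not gets, which adds a cas field).

```python
def build_set(key, value, flags=0, exptime=0):
    """Build a text-protocol `set` request: header line, then the data block."""
    data = value.encode("utf-8")
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(data))
    return header.encode("ascii") + data + b"\r\n"


def parse_get(reply):
    """Parse a `get` reply: zero or more VALUE blocks terminated by END."""
    result = {}
    rest = reply
    while not rest.startswith(b"END"):
        header, _, rest = rest.partition(b"\r\n")
        _value, key, _flags, size = header.split(b" ")
        size = int(size)
        result[key.decode("ascii")] = rest[:size]
        rest = rest[size + 2:]  # skip the data block and its trailing \r\n
    return result
```

build_set("foo", "bar") produces the same request as typing "set foo 0 0 3" followed by "bar" in telnet.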
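- Python example
import memcache
# Connect to a local memcached instance (python-memcached client)
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
# Store a string and read it back
mc.set("some_key", "Some value")
value = mc.get("some_key")
# Store an integer, then delete the key
mc.set("another_key", 3)
mc.delete("another_key")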
- linux check if memcached running
administrator@ubuntu:~/下载/appscale-1.6.8$ sudo netstat -ap | grep 11211
tcp 0 0 localhost:11211 *:* LISTEN 11643/memcached
udp 0 0 localhost:11211 *:* 11643/memcached
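The same check can be scripted without netstat by simply trying to open a TCP connection. This is a sketch; the function name and the defaults (127.0.0.1, port 11211, one-second timeout) are assumptions chosen to match the notes above.

```python
import socket


def is_listening(host="127.0.0.1", port=11211, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

Note that this only shows that something accepts connections on the port, not that it is memcached.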
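- Event handling based on libevent
libevent is a library that wraps event-handling facilities such as Linux's epoll and the kqueue of BSD-family operating systems behind a unified interface. It keeps O(1) performance even as the number of connections to the server grows. memcached uses libevent, which lets it deliver high performance on Linux, BSD, Solaris, and other operating systems. Event handling is not covered in more detail here; see Dan Kegel's The C10K Problem.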
libevent: http://www.monkey.org/~provos/libevent/
The C10K Problem: http://www.kegel.com/c10k.html
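- How to use memcached in a distributed setup?
(1) Client-side routing
The principle of client-side routing is very simple: each time the application server reads or writes the value of a key, it uses some algorithm to map the key to a memcached server, say nodeA, so every operation on that key happens on nodeA.
spymemcached is a fairly widely used Java client. It provides a simple hash algorithm, implemented in the class ArrayModNodeLocator.
The source that maps a key to a node is as follows:
public MemcachedNode getPrimary(String k) {
    return nodes[getServerForKey(k)];
}
private int getServerForKey(String key) {
    int rv = (int) (hashAlg.hash(key) % nodes.length);
    assert rv >= 0 : "Returned negative key for key " + key;
    assert rv < nodes.length
        : "Invalid server number " + rv + " for key " + key;
    return rv;
}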
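The same modulo routing idea, sketched in Python for comparison. The choice of zlib.crc32 as the hash function and the node addresses in the usage note are assumptions for illustration; spymemcached's hashAlg is configurable and is not reproduced here.

```python
import zlib


def server_for_key(key, nodes):
    """Pick a node by hashing the key and taking it modulo the node count."""
    index = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return nodes[index]
```

For example, server_for_key("foo", ["10.0.0.1:11211", "10.0.0.2:11211"]) always returns the same address for "foo" as long as the node list does not change.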
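(2) Server-side clustering??
- Adding or removing servers causes large-scale cache loss; how can this be solved?
Look at the consistent hashing algorithm first. The fix is to change the algorithm that selects the server on each read and write.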
http://en.wikipedia.org/wiki/Consistent_hashing
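Consistent hashing is a special kind of hashing. With consistent hashing, changing the number of slots (the size) of a hash table requires remapping only K/n keys on average, where K is the number of keys and n is the number of slots. In a traditional hash table, by contrast, adding or removing a slot requires remapping nearly all keys.
Consistent hashing was proposed by Karger and his collaborators at MIT, and the idea has since spread to other fields. Their 1997 paper described how consistent hashing can be applied to distributed Web services with a changing population of servers. Each slot in the hash table represents a node in the distributed system, and adding or removing a node only requires moving K/n items.[1]
The concept of consistent hashing is also used in the design of distributed hash tables (DHTs). A DHT uses consistent hashing to partition the keyspace among the nodes of a distributed system. Every key can be located efficiently on some node through an overlay network that connects all the nodes.
Amazon's cloud storage system Dynamo uses consistent hashing in its data partitioning component.[3]
Motivation
With n cache servers, a common load-balancing approach is to map a request for resource o to a cache server using hash(o) = o mod n. When a cache server is added or removed, this approach may change the hash value of every resource, which means all cached entries become invalid and the cache servers refetch from the origin content servers all at once. Consistent hashing is needed to avoid this problem.
Consistent hashing keeps the same resource mapped to the same cache server as far as possible. When a cache server is added, the new server should take over a share of the cached resources from all the other servers; when a cache server is removed, all the other servers should likewise share the resources it held.
The main idea of the algorithm is to associate each cache server with one or more intervals of the hash value range, where the interval boundaries are determined by hashing the cache servers. (The hash function that defines the intervals need not be the same as the one that hashes the cache servers, but the ranges of the two functions must match.) If a cache server is removed, its interval is merged into a neighbouring interval and the other cache servers need no change.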
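To see how badly mod n behaves when the server count changes, the following sketch counts how many keys land on a different server after going from old_n to new_n servers. The MD5-based hash and the function names are assumptions for illustration only.

```python
import hashlib


def bucket(key, n):
    """Map a key to one of n servers with hash(key) mod n."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose server changes when the server count changes."""
    moved = sum(1 for k in keys if bucket(k, old_n) != bucket(k, new_n))
    return moved / len(keys)
```

For a uniform hash, a key keeps its server when going from 4 to 5 servers only if h mod 4 equals h mod 5, which happens for 4 out of every 20 hash values, so roughly 80% of keys should be expected to move.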
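Implementation
Consistent hashing maps each object to a point on the edge of a circle, and the system then maps the available node machines to different positions on the same circle. To find the machine for an object, compute the object's position on the circle with the hash function and walk along the circle until a node machine is reached; that machine is where the object should be stored. When a node machine is removed, all objects stored on it must move to the next machine. When a machine is added at some point on the circle, the next machine after that point must hand over the objects that now fall before the new node. The distribution of objects across node machines can be changed by adjusting the positions of the node machines.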
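The ring described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular memcached client: the class name, the MD5-based position function, and the replicas parameter (several points per node, matching the "one or more intervals" idea) are all assumptions.

```python
import bisect
import hashlib


def _point(label):
    """Hash a string to a position on the ring (a 64-bit integer)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes, replicas=100):
        # Each node is placed at `replicas` points so load spreads more evenly.
        self._ring = sorted(
            (_point("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        """Walk clockwise from the key's position to the first node point."""
        i = bisect.bisect_right(self._points, _point(key))
        return self._ring[i % len(self._ring)][1]  # wrap around the circle
```

Building HashRing(["a", "b", "c", "d"]) and then HashRing(["a", "b", "c", "d", "e"]) and comparing node_for() for the same keys shows the property from the text: the only keys that change owner are the ones taken over by the new node "e".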
- Cloud computing comparison
http://en.wikipedia.org/wiki/Cloud_computing_comparison
- administrator@ubuntu:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04 LTS
Release: 12.04
Codename: precise
- administrator@ubuntu:~$ uname -a
Linux ubuntu 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
- apt-get search packages
search, install, and remove packages
apt-cache search {package-name}
apt-cache search apache
apt-get install {package-name}
apt-get remove {package-name}
dpkg --info {.deb-package-name}
List all installed packages
The syntax is:
dpkg -l
Find which package owns a file (for example /bin/netstat)
The syntax is:
dpkg -S {/path/to/file}
List each dependency a package has...
apt-cache depends package
aptitude: It is a text-based interface to the Debian GNU/Linux package system.
synaptic: GUI front end for APT
APT and Dpkg Quick Reference Sheet
http://www.cyberciti.biz/ref/apt-dpkg-ref.html
-apt-get /etc/apt/sources.list format
cd /mnt ; mkdir d; mount -t vfat /dev/hda2 /mnt/d
The following can be added to sources.list:
deb file:///mnt/debian1/ woody main contrib
deb file:///mnt/debian1/ woody/non-US main contrib
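There is a pattern here, so check the directory structure under debian1 first. The dists directory is implied and added by default. woody is the latest stable release under dists; there are also older releases as well as testing and unstable. main and contrib are two directories under woody (some mirrors also include a directory called non-free). non-US means "outside the United States" and holds software affected by copyright and encryption restrictions; it also contains main and contrib (list whichever of these directories exist).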
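The layout rule can be made concrete with a small parser for the one-line deb format. This is a sketch under simplifying assumptions: it ignores bracketed options such as [arch=...] and exact-path suites ending in "/", and the function names are invented for illustration.

```python
def parse_deb_line(line):
    """Split a one-line sources.list entry into type, URI, suite, components."""
    fields = line.split()
    if len(fields) < 3 or fields[0] not in ("deb", "deb-src"):
        raise ValueError("not a sources.list entry: %r" % line)
    return {
        "type": fields[0],
        "uri": fields[1],
        "suite": fields[2],
        "components": fields[3:],
    }


def index_dirs(entry):
    """Directories APT looks in: <uri>/dists/<suite>/<component>."""
    base = entry["uri"].rstrip("/")
    return ["%s/dists/%s/%s" % (base, entry["suite"], c)
            for c in entry["components"]]
```

For the first line above, index_dirs(parse_deb_line("deb file:///mnt/debian1/ woody main contrib")) yields the main and contrib directories under file:///mnt/debian1/dists/woody.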
- Mount an ISO file
mount -t iso9660 -o loop youisofile1.iso /dev/iso1
-linux how to find what processes have written files?
The lsof command (already mentioned in several answers) will tell you what process has a file open at the time you run it. lsof is available for just about every unix variant.
lsof /path/to/file
- apt-get install to specific directory?
Not possible.
If building from source:
./configure
make
make install DESTDIR=/mydir
- appscale manual
Google App Engine APIs
Blobstore enables users to store large entities of text or binary data
Channel, allows applications to push messages from the application server to a client’s browser.
Datastore, allows for the persistence of data through PUT, GET, DELETE, and QUERY operations. The Google Query Language (GQL) lacks relational operations such as JOIN and MERGE. Includes an implementation of transactions.
Images, generate thumbnails, rotation, composition, convert formats, and cropping images
Memcache, permits applications to store their frequently used data in a distributed memory grid
Namespaces, implements the ability to segregate data into different namespaces
TaskQueue,
Users, provides authentication for web applications through the use of cookies
URL Fetch, enables an application the ability to do POST and GET requests on remote resources.
XMPP receive and send messages to users with a valid XMPP account
http://code.google.com/appengine/docs
Other AppScale APIs
MapReduce Streaming API, supports such computation via Hadoop Streaming (http://wiki.apache.org/hadoop/HadoopStreaming).
putMRInput(data, inputLoc)
runMRJob(mapper, reducer, inputLoc, outputLoc, config)
getMROutput(outputLoc)
writeTempFile(suffix, data)
........
EC2 API
- install oracle jdk?
sudo update-alternatives
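In the Ubuntu 10.x days the Sun JDK was available from Ubuntu's official PPA, so it could be installed with the method from A-She's earlier post; since Ubuntu 11.x, however, the Sun JDK is no longer in Ubuntu's official PPA.
1. Download
2. Change the file mode: sudo chmod u+x jre-6u31-linux-i586.bin
3. Install: ./jre-6u31-linux-i586.bin
Running it unpacks the archive and creates a new folder named after the JDK version. Next, move the whole folder into /usr/lib/jvm:
sudo mkdir /usr/lib/jvm
sudo mv jdk1.6.0_32 /usr/lib/jvm
After moving the folder, run the following two commands to set up the links. Before running them, remember to replace the folder name with the JDK version you downloaded; A-She's was 1.6.0_32.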
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.6.0_32/bin/java" 1
sudo update-alternatives --install "/usr/lib/mozilla/plugins/libjavaplugin.so" "mozilla-javaplugin.so" "/usr/lib/jvm/jdk1.6.0_32/jre/lib/i386/libnpjp2.so" 1
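4. Switch the JVM
Once the two commands above have run without errors, the installation is complete if, like A-She, you have never installed any other JVM. Otherwise, run the following two commands to switch the JVM to the Sun JDK.
sudo update-alternatives --config java
sudo update-alternatives --config mozilla-javaplugin.so
5. Test
With the steps above done, if the installation succeeded, the following command shows the Sun JDK version information...
java -version
References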
https://help.ubuntu.com/community/Java
http://stackoverflow.com/questions/3747789/how-to-install-the-sun-java-jdk-on-ubuntu-10-10-maverick-meerkat
- install vmware player bundle
Download the .bundle from the VMware site, open a terminal, then
cd /home/yourusername/Desktop
then
if your VMware bundle has a different file name, use that name in the commands below
sudo sh VMware-Workstation-6.5.0-118166.i386.bundle
or
sudo sh VMware-Player-2.5.1-126130.i386.bundle
enjoy