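I tried out Gitorious. It is a very good alternative to GitHub Enterprise, and most operations work essentially the same way as on GitHub. GitHub is organized around individual users' repos, while Gitorious puts more emphasis on projects and teams, which makes it a really good fit as an internal source-code management platform.
I recommend installing it with the BitNami installer:
http://bitnami.com/stack/gitorious/
Installation is simple. The one thing to watch out for is that you must set a domain name; a bare IP address will not work.
If you want to change the domain later, just replace it in the config file “/opt/gitorious-2.4.12-1/apps/gitorious/htdocs/config/gitorious.yml”.
You can also add an entry to your local hosts file so that the domain resolves; this step should be optional.
With this in place, gitolite can retire with honor.
From the post: Gitorious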
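I recently noticed that the Python script I run on my VPS to fetch mail was using 30% of the CPU. I had also long wanted to write an email river but never got around to it, so today I took some time after work to finish this plugin. Protocols supported in theory: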
/**
now support:
imap
imaps
pop3s
pop3
*/
However, I only had time to test the pop3 protocol, which fetches mail correctly.
Repository: https://github.com/medcl/elasticsearch-river-email
To create the river:
$ curl -XPUT 'localhost:9200/_river/google/_meta' -d '{
  "type": "email",
  "email": {
    "config": [
      {
        "host": "pop.exmail.qq.com",
        "port": 110,
        "type": "pop3",
        "username": "river@infinitbyte.com",
        "password": "ail?sid=9UL",
        "check_interval": 5000,
        "skip_count": 1
      }
    ]
  },
  "index": {
    "index": "google",
    "type": "gmail"
  }
}'
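Hand-written JSON bodies are easy to get subtly wrong, so here is a minimal Python sketch that builds the same `_meta` document as a dict and serializes it. The field names are copied from the curl example; the helper `build_email_river_meta` and the `DEFAULT_PORTS` table are hypothetical conveniences, not part of the plugin, and the ports are simply the standard ones for each protocol.

```python
import json

# Standard ports for the protocols the plugin lists (an assumption about
# sensible defaults; the plugin itself is given an explicit port).
DEFAULT_PORTS = {"pop3": 110, "pop3s": 995, "imap": 143, "imaps": 993}


def build_email_river_meta(host, protocol, username, password,
                           index, doc_type, port=None,
                           check_interval=5000, skip_count=1):
    """Return the river _meta document as a plain dict (hypothetical helper)."""
    if protocol not in DEFAULT_PORTS:
        raise ValueError("unsupported protocol: %s" % protocol)
    account = {
        "host": host,
        "port": port if port is not None else DEFAULT_PORTS[protocol],
        "type": protocol,
        "username": username,
        "password": password,
        "check_interval": check_interval,
        "skip_count": skip_count,
    }
    return {
        "type": "email",
        "email": {"config": [account]},
        "index": {"index": index, "type": doc_type},
    }


meta = build_email_river_meta("pop.exmail.qq.com", "pop3",
                              "river@infinitbyte.com", "secret",
                              index="google", doc_type="gmail")
body = json.dumps(meta, indent=2)  # serializer output is always valid JSON
print(body)
```

The printed body can be saved to a file and sent with `curl -XPUT ... -d @file`.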
RTF already bundles this plugin, and it has been tested:
https://github.com/medcl/elasticsearch-rtf/tree/master/elasticsearch/plugins/river-email
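The Elasticsearch mongodb-river provides no way to re-sync the data of a database, yet there are many situations where we need exactly that. For example, after changing the Elasticsearch mapping the only option is to rebuild the data, which means pulling it from MongoDB again and rebuilding the index. So what do we do?
In fact, we only need to clear the sync information recorded by mongodb-river; the river will then re-initialize automatically, just like a fresh install.
Step one: check which information needs to be deleted. All of it lives in the _river index.
curl -XGET 'http://192.168.2.99:9200/_river/_search?q=*'
The result looks something like this; it is the record of the data-sync position: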
{ "_index": "_river", "_type": "mongodb", "_id": "testmongo.person", "_score": 1, "_source": { "mongodb": { "_last_ts": "{ \"$ts\" : 1363082244 , \"$inc\" : 1}" } } },
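Note that `_last_ts` is itself a JSON object stored as a string, so it needs a second decoding pass. The following sketch unpacks it; it assumes that `$ts` is a count of seconds since the Unix epoch (as in the time part of a MongoDB timestamp), and `decode_last_ts` is a hypothetical helper name.

```python
import json
from datetime import datetime, timezone


def decode_last_ts(source):
    """Extract (seconds, increment) from a river status document's _source."""
    raw = source["mongodb"]["_last_ts"]  # a JSON object encoded as a string
    ts = json.loads(raw)
    return ts["$ts"], ts["$inc"]


# The _source from the search hit shown above.
source = {"mongodb": {"_last_ts": "{ \"$ts\" : 1363082244 , \"$inc\" : 1}"}}
seconds, inc = decode_last_ts(source)

# Assumption: $ts is seconds since the Unix epoch.
when = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(seconds, inc, when.isoformat())
```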
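How do we deal with it? Just delete it. This record is itself nothing more than an ordinary Elasticsearch document, so find its index, type, and id and delete it.
I deleted everything here; don't blindly copy me.
curl -XDELETE 'http://192.168.2.99:9200/_river/_query?q=*'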
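A more selective alternative to wiping the whole `_river` index is to delete only the documents that carry a sync position. The sketch below takes a parsed search response and builds one DELETE URL per such document from its index, type, and id. The helper name `status_delete_urls` is hypothetical, and the `_meta` hit in the sample data is invented for illustration; only the `testmongo.person` hit comes from the output above.

```python
from urllib.parse import quote


def status_delete_urls(search_response, base_url):
    """Build DELETE URLs for hits that hold a `_last_ts` sync marker."""
    urls = []
    for hit in search_response["hits"]["hits"]:
        source = hit.get("_source", {})
        # Only documents with a nested `_last_ts` are sync markers; the
        # river's configuration document is left alone.
        is_marker = any(isinstance(v, dict) and "_last_ts" in v
                        for v in source.values())
        if is_marker:
            parts = [quote(hit[k], safe="") for k in ("_index", "_type", "_id")]
            urls.append(base_url.rstrip("/") + "/" + "/".join(parts))
    return urls


response = {"hits": {"hits": [
    {"_index": "_river", "_type": "mongodb", "_id": "testmongo.person",
     "_source": {"mongodb": {"_last_ts": "{ \"$ts\" : 1363082244 , \"$inc\" : 1}"}}},
    # Hypothetical configuration document, shown only to prove it is skipped.
    {"_index": "_river", "_type": "mongodb", "_id": "_meta",
     "_source": {"type": "mongodb", "mongodb": {"db": "testmongo"}}},
]}}

urls = status_delete_urls(response, "http://192.168.2.99:9200")
for url in urls:
    print("curl -XDELETE '%s'" % url)
```

Because the configuration document is kept, this variant would not require re-creating the river configuration afterwards.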
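Step two: if the target index needs a mapping change, data cleanup, and so on, do it now.
Step three: re-create the river configuration. What, you have no backup? Go have a good cry.
At this point the data should show up right away; it is very fast.
From the post: Re-syncing data with mongodb-river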
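Repository: https://github.com/medcl/jubatus-classifier
Adapted from the official example, with some of the parameters pulled out.
A brief introduction to how to use it:
Step one: start the services, following the two earlier posts:
Jubatus single-node test
Jubatus cluster test
Configuration file: config.json
From the post: Releasing a jubatus-classifier script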
http://jubat.us/en/tutorial_distributed.html
# Install and run ZooKeeper
wget http://mirror.bjtu.edu.cn/apache/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
tar vxzf zookeeper-3.4.5.tar.gz
cd zookeeper-3.4.5
[root@ghost-rider zookeeper-3.4.5]# cp conf/zoo_sample.cfg conf/zoo.cfg
[root@ghost-rider zookeeper-3.4.5]# bin/zkServer.sh start
JMX enabled by default
Using config: /root/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
# Register the config file in ZooKeeper
jubaconfig --cmd write --zookeeper=localhost:2181 --file config.json --name tutorial --type classifier
# Start the Jubatus Keeper, passing the ZooKeeper address
[root@ghost-rider jubatus-tutorial-python]# jubaclassifier_keeper --zookeeper=localhost:2181 --rpc-port=9198
I0320 18:29:25.938608 16930 server_util.cpp:333] starting jubaclassifier_keeper 0.4.2 RPC server at 192.168.2.100:9198
pid : 16930
user : root
timeout : 10
thread : 16
logdir :
loglevel : INFO(0)
zookeeper : localhost:2181
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@716: Client environment:host.name=ghost-rider
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@724: Client environment:os.arch=2.6.32-71.el6.x86_64
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Fri May 20 03:51:51 BST 2011
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@733: Client environment:user.name=root
2013-03-20 18:29:25,938:16930(0x7f3616cd7720):ZOO_INFO@log_env@741: Client environment:user.home=/root
2013-03-20 18:29:25,939:16930(0x7f3616cd7720):ZOO_INFO@log_env@753: Client environment:user.dir=/root/jubatus/jubatus-tutorial-python
2013-03-20 18:29:25,939:16930(0x7f3616cd7720):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=(nil) sessionId=0 sessionPasswd=<null> context=(nil) flags=0
2013-03-20 18:29:25,939:16930(0x7f3616ac7700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:2181]
2013-03-20 18:29:25,947:16930(0x7f3616ac7700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:2181], sessionId=0x13d8757e0900002, negotiated timeout=10000
I0320 18:29:25.960101 16930 keeper.cpp:53] start listening at port 9198
I0320 18:29:25.981381 16930 membership.cpp:128] keeper created: /jubatus/jubakeepers/classifier/192.168.2.100_9198
I0320 18:29:25.981425 16930 keeper.cpp:58] registered group membership
I0320 18:29:25.981451 16930 keeper.cpp:60] jubaclassifier_keeper RPC server startup
# Several classifier instances can be started, all sharing the same custom name; to test speed, start just one at first
$ jubaclassifier --rpc-port=9180 --name=tutorial --zookeeper=localhost:2181 &
$ jubaclassifier --rpc-port=9181 --name=tutorial --zookeeper=localhost:2181 &
$ jubaclassifier --rpc-port=9182 --name=tutorial --zookeeper=localhost:2181 &
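Since the instances differ only in their RPC port, the command lines can be generated instead of typed. This is a small sketch that uses only the three flags shown above; `classifier_commands` is a hypothetical helper, and it only builds the argument lists rather than launching anything.

```python
def classifier_commands(name, zookeeper, first_port, count):
    """Return one jubaclassifier argument list per instance.

    Ports are assigned consecutively starting at first_port.
    """
    return [
        ["jubaclassifier",
         "--rpc-port=%d" % (first_port + i),
         "--name=%s" % name,
         "--zookeeper=%s" % zookeeper]
        for i in range(count)
    ]


commands = classifier_commands("tutorial", "localhost:2181", 9180, 3)
for argv in commands:
    print(" ".join(argv))
```

Each list could be handed to `subprocess.Popen` to start the instance in the background.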
# Check the registered nodes in ZooKeeper
[root@ghost-rider zookeeper-3.4.5]# bin/zkCli.sh -server localhost:2181
Connecting to localhost:2181
2013-03-20 18:32:30,589 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2013-03-20 18:32:30,593 [myid:] - INFO [main:Environment@100] - Client environment:host.name=ghost-rider
2013-03-20 18:32:30,594 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.6.0_33
2013-03-20 18:32:30,594 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Sun Microsystems Inc.
2013-03-20 18:32:30,595 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/local/jdk/jre
2013-03-20 18:32:30,595 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/root/zookeeper-3.4.5/bin/../build/classes:/root/zookeeper-3.4.5/bin/../build/lib/*.jar:/root/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/root/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/root/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/root/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/root/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/root/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/root/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/root/zookeeper-3.4.5/bin/../conf:.:/usr/local/jdk//jre/lib/rt.jar:/usr/local/jdk//lib/dt.jar:/usr/local/jdk//lib/tools.jar
2013-03-20 18:32:30,596 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/local/jdk/jre/lib/amd64/server:/usr/local/jdk/jre/lib/amd64:/usr/local/jdk/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2013-03-20 18:32:30,596 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2013-03-20 18:32:30,596 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2013-03-20 18:32:30,597 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2013-03-20 18:32:30,597 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2013-03-20 18:32:30,598 [myid:] - INFO [main:Environment@100] - Client environment:os.version=2.6.32-71.el6.x86_64
2013-03-20 18:32:30,598 [myid:] - INFO [main:Environment@100] - Client environment:user.name=root
2013-03-20 18:32:30,599 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/root
2013-03-20 18:32:30,600 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/root/zookeeper-3.4.5
2013-03-20 18:32:30,602 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@2e0ece65
Welcome to ZooKeeper!
JLine support is enabled
2013-03-20 18:32:30,646 [myid:] - INFO [main-SendThread(ghost-rider:2181):ClientCnxn$SendThread@966] - Opening socket connection to server ghost-rider/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-03-20 18:32:30,652 [myid:] - INFO [main-SendThread(ghost-rider:2181):ClientCnxn$SendThread@849] - Socket connection established to ghost-rider/127.0.0.1:2181, initiating session
2013-03-20 18:32:30,662 [myid:] - INFO [main-SendThread(ghost-rider:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server ghost-rider/127.0.0.1:2181, sessionid = 0x13d8757e0900003, negotiated timeout = 30000
[zk: localhost:2181(CONNECTED) 0]
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /jubatus/actors/classifier/tutorial/nodes
Node does not exist: /jubatus/actors/classifier/tutorial/nodes
[zk: localhost:2181(CONNECTED) 1] ls /jubatus/actors/classifier/tutorial/nodes
[192.168.2.100_9180]
[zk: localhost:2181(CONNECTED) 2]
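The node names under `/jubatus/actors/classifier/tutorial/nodes` appear to follow a `<host>_<port>` pattern, as do the keeper entries. Assuming that pattern holds, a minimal sketch for turning the `ls` output line into (host, port) pairs looks like this; both function names are hypothetical.

```python
def parse_node(znode_name):
    """Split a '<host>_<port>' node name into (host, port)."""
    host, _, port = znode_name.rpartition("_")
    return host, int(port)


def parse_ls_output(line):
    """Parse the '[a, b, c]' line printed by zkCli's `ls` command."""
    inner = line.strip().strip("[]")
    return [parse_node(name.strip())
            for name in inner.split(",") if name.strip()]


nodes = parse_ls_output("[192.168.2.100_9180]")
print(nodes)
```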
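# Run the training and prediction client; point the port at the Jubatus Keeper's port and specify the classifier name
$ python tutorial.py --server_port=9198 --name=tutorial
As the amount of training data grows, accuracy climbs steadily. Impressive: training and prediction can run at the same time without interfering with each other.
From the post: Jubatus cluster test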
https://github.com/jubatus/
http://jubat.us/en/tutorial.html I followed this tutorial for a quick trial on a single machine; more investigation to come.
From the post: Jubatus single-node test
# Download the prebuilt release
wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.2.3.tgz
tar vxzf mongodb-linux-x86_64-2.2.3.tgz
cd mongodb-linux-x86_64-2.2.3
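1. The origin of the t-test and the F-test
In general, to determine the probability of being wrong when generalizing from a sample's statistics to the whole population, we use methods developed by statisticians to carry out a statistical test.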
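As a concrete illustration of such a test, here is a minimal sketch that computes the one-sample t statistic, t = (mean - mu0) / (s / sqrt(n)), where s is the sample standard deviation. The data and the hypothesized population mean `mu0` are made up for the example, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic for testing whether the population mean is mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean - mu0) / (sd / math.sqrt(n))


sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up observations
t = one_sample_t(sample, mu0=4)
print("t = %.4f with %d degrees of freedom" % (t, len(sample) - 1))
```

The larger the absolute value of t, the less plausible it is that the sample came from a population with mean `mu0`; the t distribution with n - 1 degrees of freedom turns that value into the error probability the text refers to.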
