A short philosophical story in different versions

Story One

A. A man dying of hunger staggered to a farmer's house. The farmer gave him a steamed bun and some water, and the man survived. Afterwards, he discovered that the farmer was as poor as a church mouse. Moved to tears, he knelt for a long time to express his gratitude. For the rest of his life, he kept helping others in need.

B. A man dying of hunger staggered to a farmer's house. The farmer gave him a steamed bun and some water, and the man survived. However, the man discovered that the farmer was rich and had plenty of delicious food in the living room. Instead of feeling grateful, he was angry because he thought he had been treated badly. He then grabbed a knife and killed the farmer.

Thought 1: Why does the same action lead to different results? Look at what you received, and you will be grateful; look at what you did not receive, and you will harbor resentment.

Thought 2: Whether you feel grateful or angry depends on how you look at things, not on what was done for you. In principle, the dying man should express his gratitude in both case A and case B. In reality, however, only a rational man can do this.

Thought 3: Apart from gratitude and anger, is there a third state of mind?

Story Two

A. A man dying of hunger staggered to a farmer's house. The farmer gave him a steamed bun and some water, and the man survived. Later, people found out that the man he had saved was a corrupt official wanted by the law. Many people then hated the farmer for helping a bad man.

B. A man dying of hunger staggered to a farmer's house. The farmer gave him a steamed bun and some water, and the man survived. Later, people found out that the man he had saved was very kind and had helped many poor people. So everybody praised the farmer for his great kindness.

Thought 1: If we give charity according to who a person is rather than whether he is in need, everybody will apply his own standards, and we risk being punished for helping the wrong person. Before long, nobody will want to give charity at all.

Thought 2: If people praise the farmer in both cases, the world will be full of love. Otherwise, we will be surrounded by wariness and snobbishness.

Story Three

A. A man dying of hunger staggered to a farmer's house. This farmer pointed to another farmer's home and said: "Look, that family has given you food before; please go to them." When the starving man staggered to that farmer's home and begged for food, the farmer said regretfully, "We do not have any food for you." Later, everybody in the town criticized the farmer who used to help the starving man, but no one criticized the first farmer.

B. A man dying of hunger staggered to a farmer's house. This farmer pointed to another farmer's home and said: "Look, that family has given you food before; please go to them." When the starving man staggered to that farmer's home and begged for food, the farmer said this time, "We only have half a steamed bun for you." Later, everybody in the town laughed at the farmer for his stinginess.

Thought 1: People always hold higher expectations of someone who has shown a loving heart.

Thought 2: Since we expect little of a selfish person, we are not disappointed by him.

Thought 3: This kind of expectation tends to drive out charity.

Story Four

A. A man dying of hunger staggered to a farmer's house. The farmer gave him a steamed bun and some water, and the man survived. Afterwards, he discovered that the farmer was as poor as a church mouse. Moved to tears, he knelt for a long time to express his gratitude. For the rest of his life, he kept helping others in need. In addition, he supported every decision the farmer made, whether it was right or wrong. This made the farmer arrogant, and a wrong decision eventually bankrupted him.

B. A man dying of hunger staggered to a farmer's house. The farmer gave him a steamed bun and some water, and the man survived. Afterwards, he discovered that the farmer was as poor as a church mouse. Moved to tears, he knelt for a long time to express his gratitude. For the rest of his life, he kept helping others in need. However, he also told the farmer the right thing to do, instead of flattering him to repay the debt of gratitude.

Thought 1: Different ways of repaying gratitude lead to different results.

Thought 2: You should keep thinking in the right way all the time. Telling the truth and offering criticism are always the best means.

Posted in MISC

What's Google doing in search?

1. Interesting highlighting in search results and snippets.

2. Synonym expansion (query expansion).

3. Social Search in Google Labs.

4. Google Squared: extracts interesting facts from web pages and presents them to you in a meaningful way.

Posted in Information Retrieval

Java OS-level lock

Once the application scales to multiple nodes, file-level locking is required. A quick search turns up Java's FileChannel and FileLock classes. The documentation implies that platform-independent OS-level locking is achieved with fileChannel.lock(). File channels are also safe for use by multiple concurrent threads.

Notes:

A quick test on my Ubuntu host shows that I can still open the file in vi from a terminal and edit it while the Java process holds the lock; the lock is only advisory there. It does work as expected when other Java processes try to acquire the lock, which in our case was all we needed.

Difference between fileChannel.lock() and fileChannel.tryLock()

tryLock() returns immediately: it returns a lock, or null if the lock cannot be acquired because another program holds an overlapping lock. This method does not block.

In contrast, an invocation of lock() blocks until the file can be locked.
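The difference can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code from the application above: the class name FileLockDemo, the Outcome enum and the temporary file are made up for the example. It also shows a third case that the paragraph above does not mention: when the overlapping lock is held by the same JVM rather than another program, tryLock() throws OverlappingFileLockException instead of returning null.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileLockDemo {

    /** Outcome of one non-blocking lock attempt (hypothetical helper type). */
    enum Outcome { ACQUIRED, HELD_BY_OTHER_PROCESS, HELD_BY_THIS_JVM }

    /** Tries to lock the whole file without blocking; releases the lock again on success. */
    static Outcome tryLockOnce(Path path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock();   // returns at once
            if (lock == null) {
                return Outcome.HELD_BY_OTHER_PROCESS;
            }
            lock.release();
            return Outcome.ACQUIRED;
        } catch (OverlappingFileLockException e) {
            // The lock is already held somewhere inside this JVM.
            return Outcome.HELD_BY_THIS_JVM;
        }
    }

    /** Takes the blocking lock, then probes the same file while still holding it. */
    static Outcome probeWhileLocked(Path path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel channel = raf.getChannel();
             FileLock lock = channel.lock()) {   // blocks until the lock is granted
            return tryLockOnce(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("lock-demo", ".tmp");
        try {
            System.out.println(tryLockOnce(path));
            System.out.println(probeWhileLocked(path));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Since a blocked lock() call can wait indefinitely, tryLock() in a retry loop with a timeout is often the safer choice for a multi-node setup.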

Lock while reading

You may want to lock a file while reading it, to prevent another process from modifying it. In that case obtain the channel from new RandomAccessFile(file, "rw").getChannel(), not from FileInputStream.getChannel(): the no-argument lock() requests an exclusive lock, and requesting one on a read-only channel throws a NonWritableChannelException.
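As an alternative to opening the file read-write, a shared lock can be requested on a read-only channel with the three-argument lock(position, size, shared). The sketch below is only an illustration of that variant; the class and method names are hypothetical, and whether a shared lock is enough depends on whether the writers take exclusive locks.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.NonWritableChannelException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SharedLockRead {

    /** Reads the whole file while holding a shared lock on a read-only channel. */
    static String readUnderSharedLock(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ);
             FileLock lock = channel.lock(0, Long.MAX_VALUE, true)) { // shared = true
            ByteBuffer buf = ByteBuffer.allocate((int) channel.size());
            while (buf.hasRemaining() && channel.read(buf) != -1) {
                // keep reading until the buffer is full or end of file
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
        }
    }

    /** Returns true if an exclusive lock request on a read-only channel is rejected. */
    static boolean exclusiveLockRejected(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            channel.lock().release();   // no-argument lock() is exclusive
            return false;
        } catch (NonWritableChannelException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("shared-lock", ".tmp");
        try {
            Files.write(path, "cached data".getBytes(StandardCharsets.UTF_8));
            System.out.println(readUnderSharedLock(path));
            System.out.println(exclusiveLockRejected(path));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```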

Code snippets:

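// Needs: java.io.* and java.nio.channels.FileChannel / FileLock
RandomAccessFile rfile = null;
FileInputStream fis = null;
BufferedInputStream bufIn = null;
FileLock filelock = null;
try {
    // Open read-write: the exclusive lock below needs a writable channel.
    rfile = new RandomAccessFile(file, "rw");
    FileChannel channel = rfile.getChannel();
    filelock = channel.lock();   // blocks until the lock is acquired

    // Read the whole file into memory while the lock is held.
    fis = new FileInputStream(file);
    byte[] buf = new byte[1024 * 1024 * 3];
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int len;
    while ((len = fis.read(buf)) != -1) {
        baos.write(buf, 0, len);
    }
    System.out.println("finish reading cache file");

    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    bufIn = new BufferedInputStream(bais);
} finally {
    // Close the stream only here: on some systems closing any channel
    // on the file releases the JVM's locks on it.
    if (fis != null) fis.close();
    if (filelock != null) filelock.release();
    if (rfile != null) rfile.close();
}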

α version

Posted in Java

vi/vim command summary




The following tables contain all the basic vi commands.

Starting vi

Command             Description
vi file             start at line 1 of file
vi +n file          start at line n of file
vi + file           start at last line of file
vi +/pattern file   start at pattern in file
vi -r file          recover file after a system crash

Saving files and quitting vi

Command     Description
:e file     edit file (save current file with :w first)
:w          save (write out) the file being edited
:w file     save as file
:w! file    save as an existing file
:q          quit vi
:wq         save the file and quit vi
:x          save the file if it has changed and quit vi
:q!         quit vi without saving changes

Moving the cursor

Keys pressed    Effect
h               left one character
l or <Space>    right one character
k               up one line
j or <Enter>    down one line
b               left one word
w               right one word
(               start of sentence
)               end of sentence
{               start of paragraph
}               end of paragraph
1G              top of file
nG              line n
G               end of file
<Ctrl>W         first character of insertion
<Ctrl>U         up ½ screen
<Ctrl>D         down ½ screen
<Ctrl>B         up one screen
<Ctrl>F         down one screen

Inserting text

Keys pressed    Text inserted
a               after the cursor
A               after last character on the line
i               before the cursor
I               before first character on the line
o               open line below current line
O               open line above current line

Changing and replacing text

Keys pressed    Text changed or replaced
cw              word
3cw             three words
cc              current line
5cc             five lines
r               current character only
R               current character and those to its right
s               current character
S               current line
~               switch between lowercase and uppercase

Deleting text

Keys pressed    Text deleted
x               character under cursor
12x             12 characters
X               character to left of cursor
dw              word
3dw             three words
d0              to beginning of line
d$              to end of line
dd              current line
5dd             five lines
d{              to beginning of paragraph
d}              to end of paragraph
:1,. d          to beginning of file
:.,$ d          to end of file
:1,$ d          whole file

Using markers and buffers

Command     Description
mf          set marker named "f"
`f          go to marker "f"
'f          go to start of line containing marker "f"
"s12yy      copy 12 lines into buffer "s"
"ty}        copy text from cursor to end of paragraph into buffer "t"
"ly1G       copy text from cursor to top of file into buffer "l"
"kd`f       cut text from cursor up to marker "f" into buffer "k"
"kp         paste buffer "k" into text

Searching for text

Search      Finds
/and        next occurrence of "and", for example, "and", "stand", "grand"
?and        previous occurrence of "and"
/^The       next line that starts with "The", for example, "The", "Then", "There"
/^The\>     next line that starts with the word "The"
/end$       next line that ends with "end"
/[bB]ox     next occurrence of "box" or "Box"
n           repeat the most recent search, in the same direction
N           repeat the most recent search, in the opposite direction

Searching for and replacing text

Command                       Description
:s/pear/peach/g               replace all occurrences of "pear" with "peach" on current line
:/orange/s//lemon/g           change all occurrences of "orange" into "lemon" on next line containing "orange"
:.,$s/\<file/directory/g      replace all words starting with "file" by "directory" on every line from current line onward, for example, "filename" becomes "directoryname"
:g/one/s//1/g                 replace every occurrence of "one" with 1, for example, "oneself" becomes "1self", "someone" becomes "some1"

Matching patterns of text

Expression                  Matches
.                           any single character
*                           zero or more of the previous expression
.*                          zero or more arbitrary characters
\<                          beginning of a word
\>                          end of a word
\                           quote a special character
\*                          the character "*"
^                           beginning of a line
$                           end of a line
[set]                       one character from a set of characters
[XYZ]                       one of the characters "X", "Y", or "Z"
[[:upper:]][[:lower:]]*     one uppercase character followed by any number of lowercase characters
[^set]                      one character not from a set of characters
[^XYZ[:digit:]]             any character except "X", "Y", "Z", or a numeric digit

Options to the :set command

Option        Effect
all           list settings of all options
ignorecase    ignore case in searches
list          display <Tab> and end-of-line characters
mesg          display messages sent to your terminal
nowrapscan    prevent searches from wrapping round the end or beginning of a file
number        display line numbers
report=5      warn if five or more lines are changed by command
term=ansi     set terminal type to "ansi"
terse         shorten error messages
warn          display "[No write since last change]" on shell escape if file has not been saved

Posted in Linux

Download Whole Website or Directories by using wget in Linux

You might have googled for software that downloads a given website or directory on Windows or Linux. Yes, a bunch of tools can do this for you. On Linux, though, a single command, wget, is enough. It is highly customizable and effectively a powerful crawler. Let me show you how!

wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains techstroke.com \
    --no-parent \
    www.techstroke.com/Windows/

The command above downloads the "Windows" directory of the domain techstroke.com recursively, starting from the URL www.techstroke.com/Windows/

How do you like it? Hah, really cool?

Finally, let me explain the parameters a bit more. For full details, refer to the wget documentation.

The options are:

--recursive: download the entire Web site.

--domains techstroke.com: don't follow links outside techstroke.com.

--no-parent: don't follow links outside the directory /Windows/.

--page-requisites: get all the elements that compose the page (images, CSS and so on).

--html-extension: save files with the .html extension.

--convert-links: convert links so that they work locally, off-line.

--restrict-file-names=windows: modify filenames so that they will work in Windows as well.

--no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed).

Posted in Linux

What is "Bayesian" Statistical Inference?

 
 


 
 
via LingPipe Blog by lingpipe on 9/9/09

Bayesian Inference is Based on Probability Models

Bayesian models provide full probability distributions over both observable data and unobservable model parameters. Bayesian statistical inference is carried out using standard probability theory.

What’s a Prior?

The full Bayesian probability model includes the unobserved parameters. The marginal distribution over parameters is known as the “prior” parameter distribution, as it may be computed without reference to observable data. The conditional distribution over parameters given observed data is known as the “posterior” parameter distribution.

Non-Bayesian Statistics

Non-Bayesian statisticians eschew probability models of unobservable model parameters. Without such models, non-Bayesians cannot perform probabilistic inferences available to Bayesians, such as defining the probability that a model parameter (such as the mean height of an adult male American) is in a defined range (say 5′6″ to 6′0″).

Instead of modeling the posterior probabilities of parameters, non-Bayesians perform hypothesis testing and compute confidence intervals, the subtleties of interpretation of which have confused introductory statistics students for decades.

Bayesian Technical Apparatus

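The sampling distribution p(y|\theta) models the probability of observable data y given unobservable model parameters \theta.

The prior distribution p(\theta) models the probability of the parameters \theta.

The full joint distribution over parameters and data is computed with the chain rule,

p(y,\theta) = p(y|\theta) \, p(\theta)

The posterior distribution of the parameters \theta given the observed data y is derived from the sampling and prior distributions via Bayes's rule,

p(\theta|y) = \frac{p(y|\theta) \, p(\theta)}{p(y)} = \frac{p(y|\theta) \, p(\theta)}{\int_{\Theta} p(y|\theta') \, p(\theta') \, d\theta'} \propto p(y|\theta) \, p(\theta)

The posterior predictive distribution for new data \tilde{y} given observed data y is the average of the sampling distribution over parameters proportional to their posterior probability,

p(\tilde{y}|y) = \int_{\Theta} p(\tilde{y}|\theta) \, p(\theta|y) \, d\theta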

The key feature is the incorporation into predictive inference of the uncertainty in the posterior parameter estimate. In particular, the posterior predictive distribution is an overdispersed variant of the sampling distribution. The extra dispersion arises by integrating over the posterior.

Conjugate Priors

Conjugate priors, where the prior and posterior are drawn from the same family of distributions, are convenient but not necessary. For instance, if the sampling distribution is binomial, a beta-distributed prior leads to a beta-distributed posterior. With a beta posterior and binomial sampling distribution, the posterior predictive distribution is beta-binomial, the overdispersed form of the binomial. If the sampling distribution is Poisson, a gamma-distributed prior leads to a gamma-distributed posterior; the posterior predictive distribution is negative-binomial, the overdispersed form of the Poisson.
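The beta-binomial case can be worked through in a few lines. The symbols \alpha, \beta (prior parameters) and N (number of trials) are notation assumed here for illustration; they do not appear in the original post.

```latex
% Prior: beta density over the success probability \theta
p(\theta) = \mathrm{Beta}(\theta \mid \alpha, \beta)
          \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}

% Sampling distribution: y successes in N trials
p(y \mid \theta) = \mathrm{Binomial}(y \mid N, \theta)
                 \propto \theta^{y} (1 - \theta)^{N - y}

% Posterior: multiply and collect exponents
p(\theta \mid y) \propto p(y \mid \theta) \, p(\theta)
                 \propto \theta^{\alpha + y - 1} (1 - \theta)^{\beta + N - y - 1}
                 \propto \mathrm{Beta}(\theta \mid \alpha + y, \; \beta + N - y)
```

The posterior has the same functional form as the prior, which is what makes the prior conjugate: observing data only shifts the two parameters by the counts of successes and failures.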

Point Estimate Approximations

An approximate alternative to full Bayesian inference uses p(\tilde{y}|\hat{\theta}) for predicting new data \tilde{y}, where \hat{\theta} is a point estimate.

The maximum of the posterior distribution provides the so-called maximum a posteriori (MAP) estimate,

\theta^* = \arg\max_{\theta} p(\theta|y) = \arg\max_{\theta} p(y|\theta) \, p(\theta)

If the prior is uniform, the MAP estimate is called the maximum likelihood estimate (MLE), because it maximizes the likelihood p(y|\theta) of the data. The MLE is popular among non-Bayesian statisticians because a uniform prior contributes only a constant factor and can therefore be dropped from the optimization.

By definition, the unbiased estimator for the parameter is the expected value of the posterior,

\bar{\theta} = {\mathbb E}_{p(\theta|y)}[\theta] = \int_{\Theta} \theta \, p(\theta|y) \, d\theta

Point estimates may be reasonably accurate if the posterior has low variance. If the posterior is diffuse, prediction with point estimates tends to be underdispersed, in the sense of underestimating the variance of the predictive distribution. This is a kind of overfitting which, unlike the usual situation of overfitting due to model complexity, arises from the oversimplification of the variance component of the predictive model.
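The relationship between these point estimates can be checked numerically for the beta-binomial case. The sketch below is only an illustration under assumed numbers (a uniform Beta(1, 1) prior and 7 successes in 10 trials); the class and method names are hypothetical. It uses the standard closed forms for a Beta(a, b) distribution: mean a / (a + b) and, for a, b > 1, mode (a - 1) / (a + b - 2).

```java
final class BetaBinomialPointEstimates {

    /** Posterior Beta parameters {a, b} after observing the given successes in the given trials. */
    static double[] posterior(double alpha, double beta, int successes, int trials) {
        return new double[] { alpha + successes, beta + (trials - successes) };
    }

    /** Expected value of Beta(a, b). */
    static double posteriorMean(double a, double b) {
        return a / (a + b);
    }

    /** Mode of Beta(a, b), i.e. the MAP estimate; the closed form needs a > 1 and b > 1. */
    static double mapEstimate(double a, double b) {
        if (a <= 1 || b <= 1) {
            throw new IllegalArgumentException("mode formula needs a > 1 and b > 1");
        }
        return (a - 1) / (a + b - 2);
    }

    /** Maximum likelihood estimate of a binomial success probability. */
    static double mle(int successes, int trials) {
        return (double) successes / trials;
    }

    public static void main(String[] args) {
        // Assumed example data: uniform prior Beta(1, 1), 7 successes in 10 trials.
        double[] post = posterior(1.0, 1.0, 7, 10);
        System.out.println("posterior: Beta(" + post[0] + ", " + post[1] + ")");
        System.out.println("MAP  = " + mapEstimate(post[0], post[1]));
        System.out.println("MLE  = " + mle(7, 10));
        System.out.println("mean = " + posteriorMean(post[0], post[1]));
    }
}
```

With the uniform prior the MAP estimate coincides with the MLE, while the posterior mean is pulled toward 0.5. For a single new trial, the posterior mean is also the posterior predictive probability of success, so the gap between the two numbers is the effect of plugging in a point estimate instead of integrating over the posterior.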

 
 

Posted in MISC

Be careful with RangeQuery in Lucene

As a reminder, Lucene has many Query types: TermQuery, BooleanQuery, ConstantScoreQuery, MatchAllDocsQuery, MultiPhraseQuery, FuzzyQuery, WildcardQuery, RangeQuery, PrefixQuery, PhraseQuery, Span*Query, DisjunctionMaxQuery, etc.

Lucene ships a large set of Query implementations, which makes it very powerful for search. However, be careful with queries such as RangeQuery, especially when your collection is very large.

Lucene rewrites the original Query before searching, and some of the rewrite implementations can be inefficient. Let's look at the relevant code in RangeQuery first.

public RangeQuery(Term lowerTerm, Term upperTerm, boolean inclusive,
                  Collator collator)
{
    this(lowerTerm, upperTerm, inclusive);
    this.collator = collator;
}

public Query rewrite(IndexReader reader) throws IOException {

    BooleanQuery query = new BooleanQuery(true);
    String testField = getField();
    if (collator != null) {
        TermEnum enumerator = reader.terms(new Term(testField, ""));
        String lowerTermText = lowerTerm != null ? lowerTerm.text() : null;
        String upperTermText = upperTerm != null ? upperTerm.text() : null;

        try {
            do {
                Term term = enumerator.term();
                if (term != null && term.field() == testField) { // interned comparison
                    if ((lowerTermText == null
                            || (inclusive ? collator.compare(term.text(), lowerTermText) >= 0
                                          : collator.compare(term.text(), lowerTermText) > 0))
                        && (upperTermText == null
                            || (inclusive ? collator.compare(term.text(), upperTermText) <= 0
                                          : collator.compare(term.text(), upperTermText) < 0))) {
                        addTermToQuery(term, query);
                    }
                }
            }
            while (enumerator.next());
        }
        finally {
            enumerator.close();
        }
    }
    ...............
}

As the source code shows, a RangeQuery may be rewritten into thousands of TermQuery clauses. This makes the search inefficient and can even cause a TooManyClauses exception. In addition, the rewrite method in RangeQuery traverses the entire term dictionary, which is another reason RangeQuery can make searching slow.

In contrast, RangeFilter does the job faster. Although RangeFilter also traverses the term dictionary, it does not incur the additional per-term search work that the rewritten RangeQuery does.

The RangeFilter implementation in Lucene does not consume much memory either: it needs one bit per document, so roughly 1.25 MB for a collection of 10M documents. For these reasons, I recommend RangeFilter over RangeQuery.

In fact, ConstantScoreRangeQuery is a wrapper around RangeFilter that lets us run a range search as a query. It returns a constant score equal to its boost for every document in the range. It is a better choice than RangeQuery when we only want to restrict the result set to a range rather than have the range clause contribute to the ranking.
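The trade-off can be illustrated without Lucene itself. The sketch below is not Lucene code: it models the term dictionary as a plain TreeMap from term text to document ids (an assumption made for the example, as are all class and method names) and compares the number of clauses a rewrite-style expansion would need with the single bit set a filter-style evaluation builds. Unlike the collator branch quoted above, it jumps straight to the range instead of scanning every term. The constant 1024 is the default clause limit of BooleanQuery.

```java
import java.util.BitSet;
import java.util.TreeMap;

final class RangeSketch {

    /** Default maximum clause count of Lucene's BooleanQuery. */
    static final int MAX_CLAUSES = 1024;

    /** Clauses a rewrite-style expansion needs: one per term in [lower, upper]. */
    static int clauseCount(TreeMap<String, int[]> dict, String lower, String upper) {
        return dict.subMap(lower, true, upper, true).size();
    }

    /** Filter-style evaluation: one bit per matching document, no per-term clauses. */
    static BitSet filterBits(TreeMap<String, int[]> dict, String lower, String upper, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int[] docs : dict.subMap(lower, true, upper, true).values()) {
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Bytes needed for a bit set with one bit per document. */
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    public static void main(String[] args) {
        // Toy index: 2000 distinct terms "0000".."1999", one document each.
        TreeMap<String, int[]> dict = new TreeMap<>();
        for (int i = 0; i < 2000; i++) {
            dict.put(String.format("%04d", i), new int[] { i });
        }
        int clauses = clauseCount(dict, "0000", "1999");
        System.out.println("clauses needed: " + clauses
                + ", over the limit: " + (clauses > MAX_CLAUSES));
        System.out.println("documents matched by filter: "
                + filterBits(dict, "0000", "1999", 2000).cardinality());
        System.out.println("bit set bytes for 10M docs: " + bitSetBytes(10_000_000L));
    }
}
```

The clause count grows with the number of distinct terms in the range, while the filter's memory depends only on the number of documents, which is why the filter scales better on fields such as timestamps.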

Note: FuzzyQuery, WildcardQuery, RangeQuery and PrefixQuery are implemented in much the same way, so be careful when using them as well.

Posted in Information Retrieval

How many Japanese troops did China's armed forces kill back then?

 
 


 
 
via 王晓阳 by 王晓阳 on 8/13/09

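The Sino-Japanese War was an event that marked a major turning point in the course of modern Chinese history. Its influence on Chinese history far exceeded that of the Xinhai Revolution, the May Fourth Movement, the Northern Expedition and other events, and its outcome still heavily shapes China's situation today.

Tomorrow, August 15, is the anniversary of Japan's surrender and the end of World War II. Many countries will commemorate it, and China is no exception. The question is: who is entitled to celebrate?

In the past, mainland textbooks always said that the Nationalist government did not resist Japan and that only the Chinese Communist Party (CCP) did; later they gradually acknowledged that the Nationalist government was the main force of resistance. Now some people say that the CCP did not resist Japan at all. What actually happened?

The claim that "the CCP did not resist Japan at all" is too sweeping. By then the Soviet Union had instructed the CCP to resist Japan, the CCP had raised the anti-Japanese slogan "Defend the Soviet Union", and it did in fact fight some battles.

Let the numbers speak. If one cannot even say clearly how many invaders one killed back then, how can one claim the credit?

Historians must have a conscience and do right by the anti-Japanese martyrs who lie buried. Those who claim heaven's credit as their own will be struck by lightning.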

 

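1. How many Japanese troops did the government army and the Communist army each kill?

During the War of Resistance, the number of Japanese troops in China peaked at nearly 2 million; this figure is essentially undisputed. What is disputed is how many Japanese soldiers died, and several figures exist. According to calculations by American scholars based on Japanese wartime statistics, just over 440,000 Japanese troops were killed on the mainland. 张忠义 (Zhang Zhongyi), an expert on the history of the war, drew extensively on Japanese military sources and arrived at a similar figure, 455,000. 何应钦 (He Yingqin), chief of the general staff of the Nationalist army, gave 480,000 in 《八年抗战》, while the Military Museum of the Chinese People's Revolution uses 550,000, a figure from consolidated statistics compiled after 1949. Some experts disagree: Professor 刘大年 (Liu Danian) of the Academy of Social Sciences, for example, calculated from Nationalist battlefield statistics that more than 1 million Japanese troops died in China.

It must be noted that the Soviet Union later, in order to grab territory, hastily sent troops into Northeast China and eliminated about 600,000 troops of the Kwantung Army. This was two robbers fighting over territory in China; it was no merit, and still less can it be entered in the Chinese forces' record of achievements. Moreover, the Japanese and some overseas scholars at the time regarded the Northeast as Manchukuo, so Kwantung Army deaths were never counted within the China theater in the strict sense.

The authoritative Japanese historian 伊藤正德 (Ito Masanori), author of 《帝国陆军史》, records in his book a total of 789,370 Japanese troops killed in China. This figure is fairly credible.

At the time, apart from the Chinese government army, only the Communist Party had an army. So how many Japanese troops did each of them kill?

Some mainland scholars today tend to accept Ito Masanori's data: Communist-led forces killed 200,000 Japanese troops, and the government army killed 589,370. Two hundred thousand is fewer than the mainland used to claim, but it at least shows that the statement "the CCP did not resist Japan at all" is wrong. The CCP did indeed resist Japan.

Others lean toward a total of 440,000 Japanese troops killed: 400,000 by the National Revolutionary Army, 20,000 by the Communist army, and 20,000 other deaths.

Killing at most fewer than 790,000 Japanese troops in eight years of war is embarrassing, given that the Soviets eliminated 600,000 in the short time they spent grabbing territory in China. Considering the weapons and equipment of the Chinese forces at the time, however, it is understandable.

Note: Japanese troops killed by the National Revolutionary Army's expeditionary force in Burma and elsewhere are not included.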

 

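2. Battle reports of the government army and the Communist army

It must be borne in mind that the Japanese army did its utmost to understate its publicly announced casualties, while the Nationalist and Communist armies tended to exaggerate their kill counts.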

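Selected Eighth Route Army results compared with Japanese battle reports

1. Battle of Pingxingguan
Eighth Route Army report: more than 1,000 Japanese troops annihilated
Japanese report: 167 Japanese killed, 94 wounded (儿岛襄, 《日中战争》, Bungeishunju, 1984)

2. Guangyang ambush
Eighth Route Army report: over 1,000 Japanese troops annihilated
Japanese report: 63 Japanese casualties (臼井胜美, 《中日战争》)

3. Jin-Cha-Ji region counter-campaign against the eight-route encirclement
Eighth Route Army report: more than 2,000 Japanese and puppet troops annihilated
Japanese report: 17 Japanese killed, 52 wounded; 69 collaborationist army casualties (臼井胜美, 《中日战争》)

4. Three sabotage raids on the Beiping-Hankou Railway
Eighth Route Army report: more than 1,200 Japanese and puppet troops annihilated
Japanese report: 2 Japanese killed, 11 wounded; no collaborationist army casualties reported (《支那事变陆军作战》)

1938

5. Central Hebei spring 1938 counter-"mopping-up" campaign
Eighth Route Army report: more than 1,000 Japanese and puppet troops annihilated
Japanese report: 6 Japanese killed, 26 wounded; 71 collaborationist army casualties (《华北治安战》)

6. 120th Division's campaign to recover seven towns in northwestern Shanxi
Eighth Route Army report: more than 1,500 Japanese and puppet troops annihilated
Japanese report: 22 Japanese killed, 51 wounded; 101 collaborationist army casualties (《华北治安战》)

7. Yi County-Laiyuan battle
Eighth Route Army report: more than 1,400 Japanese and puppet troops annihilated
Japanese report: 9 Japanese killed, 22 wounded; 40 collaborationist army casualties (《支那事变陆军作战》)

8. 129th Division's counter-campaign against the Japanese nine-route encirclement in southeastern Shanxi
Eighth Route Army report: more than 4,000 Japanese and puppet troops annihilated
Japanese report: 11 Japanese killed, 10 wounded; 79 collaborationist army casualties (《华北治安战》)

9. Jin-Cha-Ji region autumn 1938 counter-encirclement campaign
Eighth Route Army report: more than 5,000 Japanese and puppet troops killed or wounded
Japanese report: 39 Japanese killed, 132 wounded; 107 collaborationist army casualties (臼井胜美, 《中日战争》)

10. Five counter-encirclement campaigns in the Central Hebei region
Eighth Route Army report: more than 5,500 Japanese and puppet troops annihilated
Japanese report: 21 Japanese killed, 65 wounded; 99 collaborationist army casualties (臼井胜美, 《中日战争》)

11. Southern Hebei 1938 counter-"mopping-up" campaign
Eighth Route Army report: more than 600 Japanese and puppet troops killed or captured
Japanese report: 3 Japanese killed, 11 wounded; 16 collaborationist army casualties (臼井胜美, 《中日战争》)

1939

12. Southern Hebei spring counter-campaign against the eleven-route "mopping-up"
Eighth Route Army report: more than 3,000 Japanese and puppet troops annihilated
Japanese report: 37 Japanese killed, 70 wounded; 81 collaborationist army casualties (臼井胜美, 《中日战争》)

13. 115th Division's breakout at Lufang
Eighth Route Army report: more than 1,300 Japanese and puppet troops killed or wounded
Japanese report: 10 Japanese killed, 122 wounded; 67 collaborationist army casualties (《华北治安战》)

14. Wutai Mountains counter-encirclement campaign, May 1939
Eighth Route Army report: more than 800 troops of the Japanese Miyazaki (宫崎) unit annihilated
Japanese report: 4 Japanese killed, 27 wounded (《华北治安战》)

15. Taihang region summer 1939 counter-"mopping-up" campaign
Eighth Route Army report: more than 2,000 Japanese and puppet troops annihilated
Japanese report: 7 Japanese killed, 37 wounded; 70 collaborationist army casualties (《华北治安战》)

16. Central Hebei winter 1939 counter-"mopping-up" campaign
Eighth Route Army report: more than 2,500 Japanese and puppet troops annihilated
Japanese report: 27 Japanese killed, 89 wounded; 71 collaborationist army casualties (《华北治安战》)

17. Beiyue region winter 1939 counter-"mopping-up" campaign
Eighth Route Army report: more than 3,600 Japanese and puppet troops killed or wounded
Japanese report: 9 Japanese killed, 34 wounded; 95 collaborationist army casualties (《华北治安战》)

1940

18. Pingxi region spring 1940 counter-"mopping-up" campaign
Eighth Route Army report: more than 800 Japanese and puppet troops annihilated, 1 Japanese aircraft shot down
Japanese report: 8 Japanese killed, 40 wounded; 22 collaborationist army casualties (《华北治安战》)

19. Central Hebei spring 1940 campaign against the general "mopping-up"
Eighth Route Army report: more than 3,000 Japanese and puppet troops killed or wounded
Japanese report: 11 Japanese killed, 91 wounded; 62 collaborationist army casualties (《华北治安战》)

20. Baodugu mountain area counter-"mopping-up" campaign (also called the Southern Shandong 1940 counter-"mopping-up" campaign)
Eighth Route Army report: more than 2,200 Japanese and puppet troops killed or wounded
Japanese report: 9 Japanese killed, 60 wounded; 58 collaborationist army casualties (《华北治安战》)

21. 129th Division's sabotage campaign against the Bai-Jin Railway
Eighth Route Army report: more than 600 Japanese and puppet troops annihilated
Japanese report: 2 Japanese killed, 9 wounded; 12 collaborationist army casualties (《华北治安战》)

22. Northwestern Shanxi summer 1940 counter-"mopping-up" campaign
Eighth Route Army report: more than 4,490 Japanese and puppet troops killed or wounded, 53 captured (including 11 Japanese)
Japanese report: 37 Japanese killed, 107 wounded, 3 missing; 201 collaborationist army killed, wounded or missing (《华北治安战》)

23. Central Hebei summer 1940 "Green Curtain" campaign
Eighth Route Army report: more than 2,100 Japanese and puppet troops killed or wounded, more than 500 puppet troops captured
Japanese report: 19 Japanese killed, 22 wounded; 39 collaborationist army casualties (《华北治安战》)

24. Hundred Regiments Offensive
Eighth Route Army report: more than 20,000 Japanese and more than 5,000 puppet troops killed or wounded; more than 280 Japanese and more than 18,000 puppet troops captured
Japanese report: 302 killed, 1,719 wounded; 1,202 collaborationist army killed, wounded or missing (《华北治安战》)

25. Taihang region autumn 1940 counter-"mopping-up" campaign
Eighth Route Army report: more than 2,800 Japanese and puppet troops annihilated
Japanese report: 29 Japanese killed, 60 wounded; 44 collaborationist army casualties (《华北治安战》)

26. Central Hebei winter 1940 offensive
Eighth Route Army report: more than 2,300 Japanese and puppet troops annihilated
Japanese report: 10 Japanese killed, 27 wounded; 59 collaborationist army casualties (《华北治安战》)

27. Taiyue winter 1940 counter-"mopping-up" campaign
Eighth Route Army report: more than 260 Japanese and puppet troops annihilated
Japanese report: 7 Japanese wounded; 15 collaborationist army casualties (《华北治安战》)

28. Northwestern Shanxi winter 1940 counter-"mopping-up" campaign
Eighth Route Army report: more than 2,500 Japanese and puppet troops killed or wounded
Japanese report: 8 Japanese killed, 44 wounded; 102 collaborationist army casualties (《华北治安战》)
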

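The Nationalist army

1. Battle of Shanghai (Songhu)
Nationalist army report (1937): more than 60,000 Japanese casualties; 孙元良 (Sun Yuanliang) personally estimated in 2005 that Japanese casualties were 40,000 to 50,000.
Japanese report: in 1937 the Japanese army announced 9,115 killed and 31,157 wounded, 40,672 casualties in total.

2. Battle of Taiyuan
Nationalist army report: more than 40,000 Japanese killed or wounded
Japanese report: more than 26,000 Japanese casualties (《中国事变陆军作战史》)

3. Defense of Nanjing
Nationalist army report: more than 15,000 Japanese killed or wounded
Japanese report: more than 7,600 Japanese casualties (《中国事变陆军作战史》)

4. Battle of Xuzhou
Nationalist army report: more than 50,000 Japanese killed or wounded
Japanese report: in 1937 the Japanese army admitted more than 32,000 casualties

5. Battle of Wuhan
Nationalist army report: more than 200,000 Japanese killed or wounded
Japanese report: more than 30,000 casualties, plus more than 67,000 lost to illness (《中国事变陆军作战》)

6. Battle of Suixian-Zaoyang
Nationalist army report: more than 40,000 Japanese killed or wounded
Japanese report: more than 13,000 Japanese casualties (《中国事变陆军作战》)

7. Battle of Zaoyang-Yichang
Nationalist army report: 23,000 Japanese killed or wounded
Japanese report: more than 9,000 Japanese casualties (《中国事变陆军作战》)

8. Battle of Nanchang
Nationalist army report: 12,000 Japanese killed or wounded
Japanese report: more than 9,000 Japanese casualties (《中国事变陆军作战》)

13. Battle of Shanggao
Nationalist army report: 20,000 Japanese killed or wounded
Japanese report: more than 9,000 Japanese casualties, plus 6,000 lost to illness (《中国事变陆军作战》)

14. Battle of South Shanxi (Zhongtiao Mountains)
Nationalist army report: 9,900 Japanese killed or wounded
Japanese report: Japanese losses of 670 killed in action and 2,292 wounded (《中国事变陆军作战》)

15. Second Battle of Changsha
Nationalist army report: more than 20,000 Japanese killed or wounded (some say 40,000)
Japanese report: more than 7,000 Japanese casualties (《中国事变陆军作战》)

16. Third Battle of Changsha
Nationalist army report: more than 50,000 Japanese killed or wounded
Japanese report: 6,000 casualties, of which 1,600 killed (《中国事变陆军作战》)

17. Zhejiang-Jiangxi campaign
Nationalist army report: more than 30,000 Japanese killed or wounded
Japanese report: 17,148 Japanese casualties (《中国事变陆军作战》)

18. Battle of West Hubei
Nationalist army report: more than 40,000 Japanese killed or wounded
Japanese report: Japanese losses of more than 4,000 (《中国事变陆军作战》)

19. Battle of Changde
Nationalist army report: more than 50,000 Japanese killed or wounded
Japanese report: Japanese losses of more than 20,000 (《中国事变陆军作战》)

20. Battle of Central Henan
Nationalist army report: more than 4,000 Japanese killed or wounded
Japanese report: Japanese losses of 3,350 (《中国事变陆军作战》)

21. Battle of Changsha-Hengyang
Nationalist army report: more than 60,000 Japanese killed or wounded
Japanese report: Japanese losses of more than 60,000 (the two sides' figures are strikingly similar) (《中国事变陆军作战》)

22. Battle of Guilin-Liuzhou
Nationalist army report: more than 30,000 Japanese killed or wounded
Japanese report: Japanese losses of more than 16,000 (《战史丛书--大本营陆军部》)

23. Battle of Northern Burma
Nationalist army report: more than 90,000 Japanese killed or wounded
Japanese report: more than 40,000 Japanese casualties (《中国事变陆军作战》)

Note: 《中国事变陆军作战》 and 《支那事变陆军作战》 are the same book, compiled by the Japan Defense Agency in the 1960s and 1970s and used as a textbook in Japanese military academies. All of the Japanese-side figures above come from sources inside Japan.
The Japanese side even knows the name of every casualty. Pity our nameless heroes.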

 

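3. Which side killed the Japanese generals who died?

Total figures are easy to pad; dead generals are much harder to fake. Here are the numbers: 129 Japanese general officers died in the Sino-Japanese War. Excluding those who died of illness, by suicide, in air accidents, or at the hands of Soviet-Mongolian forces or the Chinese-American joint air unit, 50 generals were killed by Chinese forces: 45 by the Nationalist army and 5 by the Communists, including one who was assassinated.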

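In action against the National Revolutionary Army:
林大八, army major general, 1 March 1932, died in Shanghai.
仓永辰治, army major general, 29 August 1937, died at Wusong, Shanghai.
家纳治雄, army major general, 11 October 1937, died in Shanghai.
浅野嘉一, army major general, 14 November 1937, died of battle wounds in Tianjin.
加藤仁太郎, navy rear admiral, 31 July 1938, died on the lower Yangtze.
杵春久藏, army major general, 2 August 1938, died at Yuncheng, Shanxi.
饭冢国五郎, army major general, 3 September 1938, died at De'an, Jiangxi.
小笠原数夫, army air service lieutenant general, 4 September 1938, aircraft shot down at Xiaogan, Hubei.
饭野贤十, army major general, 22 March 1939, died in Nanchang.
山田喜藏, army major general, 12 May 1939, died in the Dahong Mountains, Hubei.
田路朝一, army lieutenant general, 17 June 1939, died in southern Anhui.
小林一男, army major general, 21 December 1939, died at Anbei, Inner Mongolia.
中村正雄, army lieutenant general, 25 December 1939, died at Kunlun Pass, Guangxi.
秋山静太郎, army major general, 23 January 1940, died in Shandong.
左藤谦, army major general, 2 March 1940, died at Poyang Lake, Jiangxi.
木谷资俊, army lieutenant general, 20 March 1940, died in Jiangxi.
水川伊夫, army lieutenant general, 22 March 1940, died at Wuyuan, Inner Mongolia.
前田治, army lieutenant general, 23 May 1940, died at Jincheng, Shanxi.
藤堂高英, army lieutenant general, 3 June 1940, died at Ruichang, Jiangxi.
大冢彪雄, army lieutenant general, 5 August 1940, died in southeastern Shanxi.
井山官一, army major general, 16 October 1940, died at Yichang, Hubei.
大角芩生, navy admiral, 5 February 1941, aircraft shot down at Zhongshan, Guangdong.
须贺彦次郎, navy vice admiral, 5 February 1941, aircraft shot down at Zhongshan, Guangdong.
上田胜, army major general, 13 May 1941, died in the Zhongtiao Mountains, Shanxi.
山县业一, army lieutenant general, 25 December 1941, died in Anhui.
酒井直次, army lieutenant general, 28 May 1942, died at Nanxi, Zhejiang.
冢田攻, army general, 18 December 1942, died at Taihu, Anhui.
藤原武, army major general, 18 December 1942, died at Taihu, Anhui.
浅野克己, army major general, May 1943, died at Dongjiang, Guangdong.
仁科馨, army major general, 1 June 1943, died in Hunan.
黑川邦辅, army major general, 28 June 1943, died in Yunnan.
布上照一, army major general, 23 November 1943, died at Changde, Hunan.
中?护一, army major general, 25 November 1943, died at Changde, Hunan.
下川义忠, army lieutenant general, 19 April 1944, died at Yingcheng, Hubei.
横山武彦, army lieutenant general, 11 June 1944, died at Longyou, Zhejiang.
木村千代太, army lieutenant general, 11 June 1944, died in Henan.
和尔基隆, army major general, 21 July 1944, died at Hengyang, Hunan.
大桥彦四郎, army major general, 25 July 1944, died in the Battle of Changsha-Hengyang, Hunan.
左治直影, army major general, 27 July 1944, died at Jingzhou, Hubei.
志摩源吉, army lieutenant general, 6 August 1944, died at Hengyang, Hunan.
藏重康美, army major general, 16 August 1944, died at Tengchong, Yunnan.
南野丰重, army major general, 8 September 1944, died at Mangshi, Yunnan.
与野山寿, army major general, 9 February 1945, died in Central China.
山县正乡, navy admiral, 7 March 1945, died at Jiaojiang, Zhejiang.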

 

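In action against the Eighth Route Army:

沼田德重, army lieutenant general, 12 August 1939, wounded by the Eighth Route Army and died in Shandong.
阿部规秀, army lieutenant general, 7 November 1939, died fighting the Eighth Route Army at Laiyuan, Hebei.
吉川贞佐, army major general, 17 May 1940, assassinated by a Communist Party member at Kaifeng, Henan.
饭田泰次郎, army lieutenant general, 28 November 1940, died fighting the Eighth Route Army in North China.
吉川资, army major general, 7 May 1945, died fighting the Eighth Route Army on the Shandong Peninsula.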

  

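War costs lives. So how many soldiers did the Nationalist and Communist armies each lose, and by how much had each side's forces shrunk or grown by the end of the War of Resistance? That is covered in the next article.

Links:

《抗日战争:掉进了苏联陷阱》 (The War of Resistance: Falling into the Soviet Trap)

《多少中国军人死于抗日战争?》 (How Many Chinese Soldiers Died in the War of Resistance?)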

 

 
 

Posted in MISC

The internet at sort-of-40. How did we get here?


We're looking to compile a history of the internet, by the internet. Want to help?


The internet is sort-of-40 this year. Not in the sense of a Hollywood actor who is in reality much older but prefers to act vague, however. In the sense that if you set the October 1969 networking of US research universities through Arpanet as the start point then it is a significant birthday.

To mark this, we want to tell the internet's story. This is not the first time this has been done and will not be the last, but we want to tell the story of the internet using the internet – that is, the people who use it.

Below there is a list of 30 events from the past 40 years – encompassing the technological development of the internet and some of the impact it has had on culture, business, politics and society. Some of that makes for entertaining reading – reaction to the first piece of spam (a US army major gets involved) or the 1982 conversation that led to the first use of the :-) emoticon.

But these 30 events are not the only ones that mattered. There is no YouTube on here, nothing of Barack Obama's use of the web for fundraising – and that is intentional. We'd like to know what you think is significant.

At the bottom of this page is a form where we would like you to nominate events memorable to you, be they ones we may already know about or something more personal such as the first websites you used or emails you sent. Our list is, for example, light on social media moments or internet dating. Or the thrill of a first Geocities site.

Maybe you did some of this pioneering work in the early days of the internet and want to talk about it. Whatever your experiences, we'd like to hear from you.

Where will it end? Well, this is a work in progress. But we will publish updates to the list and this autumn hope to produce an impressive told-by-the-people version of the internet story.

And here is the list of 30 ...

1969 Arpanet starts Computers at two academic departments in California are linked by Arpanet, the predecessor of the internet
1971 @ Ray Tomlinson devises electronic mail for arpanet. He settles on @ to separate the name of the user from the name of their computer
1971 Project Gutenberg Michael Hart begins a project to make copyright-free works electronically available. The first text is the US Declaration of Independence, now archived as gutenberg.org/etext/1
1971 Expansion The network is now connecting 23 hosts
1973 ARPAWOCKY Early network humour: Twas brillig, and the Protocols / Did USER-SERVER in the wabe./ All mimsey was the FTP, / And the RJE outgrabe
1973 To Europe Norway is connected to Arpanet via Norsar, a US-Norwegian network to relay information on earthquakes and nuclear explosions. From Norway, a connection goes to University College London
1974 TCP/IP Vint Cerf and others publish a proposal to link up Arpa-like networks. It has no central control and is built around a protocol (TCP/IP) for the exchange of data
1976 Royal email Queen Elizabeth sends her first email on a visit to the MoD’s scientific research hub
1978 Spam Gary Thuerk sends what is now considered thefirst unsolicited commercial email. Major Raymond Czahor of the US defence communications agency assures Arpanet users it will not happen again
1978 Bulletin boards The first bulletin board is developed during a particularly bad blizzard in Chicago. Ward Christensen's creation allows computer users with a modem to talk to each other and exchange software and data
1982 :-) Scott Fahlman proposes the use of  :-) after a joke, beating off rivals including %, * and {#} - said to be 'like two lips with teeth showing between them'
1983 Internet begins? 1 January is the cut-off point for computers to use Cerf's transmission control protocol (TCP). Cerf estimates this involved between 200-400 hosts
1984 Lots more connections The number of hosts breaks 1,000, Japan establishes Junet, the UK begins Janet (the joint academic network) and the Soviet Union connects to Usenet.
1984 The Well It calls itself 'the primordial ooze where the online community movement was born'. A Guardian profile of The Well's co-founder Stewart Brand said it was 'where most of the discoveries of cyberspace were first made'
1985 .com The domain name that for many defines the web is created. The oldest .com registration still in existence belongs to Virginia-based Symbolics
1989 Start of the web Tim Berners-Lee proposes to his bosses at Cern a document retrieval system to run on the internet. His mechanism will use hypertext to make a file in one location appear as if it is in a window on another
1990 Archie Considered the first internet search engine, Archie is created by Canadian university student Alan Emtage. It allows users to match queries against file names (not the content of those files, that was still to come)
1990 Internet toaster A toaster becomes the first remotely-operated machine connected to the internet. A single control - power on or power off - is used to control grilling. It still requires a human to insert the bread
1991 First web page published The web goes public. Its first page explains it is a 'wide-area hypermedia information retrieval initiative'
1991 Webcam coffee A coffee pot in a Cambridge University computer lab is the inspiration for the world's first webcam. It allows people in other parts of the building to avoid pointless trips when it is empty
1992 L0pht The Boston-based hacker collective is founded
1994 Yahoo! Jerry and David's Guide to the World Wide Web is launched. In time it is renamed Yahoo!
1995 Amazon.com The internet bookseller goes online. By the final quarter of 2001 it turns a profit - a little behind its plan for profitability within four to five years, but is still considered an exceptional dotcom performer
1996 Proto-Google Larry Page and Sergey Brin, PhD students at Stanford, begin work on BackRub, a search engine that ranks websites according to the number of links to them. It is incorporated as Google in 1998
1999 'Celestial jukebox' Shawn Fanning's Napster application launches. It allows users to share music files on each other's computers
1999 MI6 names leaked The uncontrollable nature of the internet is brought to attention when the names of more than 100 MI6 agents are leaked to a US website. Despite being taken down, the names spread across other sites
2001 Wikipedia It proclaims itself a collaborative encyclopedia. Eight years after launch it is now the most popular reference work online
2001 SETI@Home A project to harness the distributed processing power of the internet gathers enough volunteers within four weeks to surpass the most powerful supercomputer of its time
2004 The war on spam Bill Gates tells the World Economic Forum at Davos that spam will be eradicated within two years. It isn't
2005 First spam conviction Jeremy Jaynes is sentenced to nine years in prison and his sister, Jessica DeGroot, is fined $7,500
2006 Twitter The 140-character service launches. Many who initially try it think it pointless. By 2009 it is credited with transmitting news of Iranian protests to the outside world

You may notice the launch of Twitter is the final item on this list. That is not to suggest that it is the final perfection of the internet (just to be clear).


The Ivory Toolkit with the SMRF Retrieval Engine (under Hadoop Framework)

As IR datasets grow in size, a powerful platform for rapid indexing and searching is needed. Ivory is a newly announced search platform built on Hadoop. It could be a good choice as we enter the era of billion-document collections.

This would also be a future step for our SaberLucene Project (under release). Besides the MapReduce framework, we would also like to integrate the Indri Query Language into SaberLucene. After these two major steps, we can expect a first release of SaberLucene. Any help will be appreciated.

-------------------------------------------

The Ivory Toolkit with the SMRF Retrieval Engine

Ivory is a Hadoop toolkit for Web-scale information retrieval research that features a retrieval engine based on Markov Random Fields, appropriately named SMRF (Searching with Markov Random Fields). This open-source project began in Spring 2009 and represents a collaboration between the University of Maryland and Yahoo! Research. Ivory takes full advantage of the Hadoop distributed environment (the MapReduce programming model and the underlying distributed file system) for both indexing and retrieval.

In order to temper expectations, please note that Ivory is not meant to serve as a full-featured search engine (e.g., Lucene), but rather aimed at information retrieval researchers who need access to low-level data structures and who generally know their way around retrieval algorithms. As a result, a lot of "niceties" are simply missing—for example, fancy interfaces or ingestion support for different file types. It goes without saying that Ivory is a bit rough around the edges, but our philosophy is to release early and release often. In short, Ivory is experimental!

Ivory was specifically designed to work with Hadoop "out of the box" on the ClueWeb09 collection, a 1 billion page (25 TB) Web crawl distributed by Carnegie Mellon University. The initial release of Ivory is meant to serve as a reference implementation of indexing and retrieval algorithms that can operate at the multi-terabyte scale. Another interesting experimental aspect of Ivory is its retrieval architecture: we've been playing with retrieval engines that directly read postings from HDFS. The getting started guide with TREC disks 4-5 provides more details.
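
To make the idea of MapReduce-based indexing mentioned above more concrete, here is a minimal single-machine sketch in Python. It is not Ivory's code and does not use Hadoop; the function names (map_phase, shuffle, reduce_phase, build_index) and the toy documents are made up for illustration. It only shows the general pattern: the map step emits (term, posting) pairs per document, the framework groups them by term, and the reduce step assembles each term's postings list.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit (term, (doc_id, term_frequency)) pairs for one document."""
    counts = defaultdict(int)
    for term in text.lower().split():
        counts[term] += 1
    for term, tf in counts.items():
        yield term, (doc_id, tf)


def shuffle(pairs):
    """Group postings by term, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for term, posting in pairs:
        grouped[term].append(posting)
    return grouped


def reduce_phase(term, postings):
    """Sort postings by document id to form the term's postings list."""
    return term, sorted(postings)


def build_index(docs):
    """Run map, shuffle and reduce in sequence over a dict of doc_id -> text."""
    pairs = [pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(term, postings)
                for term, postings in shuffle(pairs).items())


# Hypothetical toy collection, for illustration only.
docs = {
    1: "hadoop indexing at scale",
    2: "indexing and retrieval with hadoop hadoop",
}
index = build_index(docs)
print(index["hadoop"])  # [(1, 1), (2, 2)]
```

In a real Hadoop job the map and reduce steps would run on different machines and the postings would be written to the distributed file system; this sketch keeps everything in memory so the data flow is easy to follow.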

