Friday, 10 July 2009

Natural Language Machine Learning



Natural language aids communication and is composed of words and sentences. The words we use can be categorised into groups that correlate to our experience of our known environment.

Objects:
Locations:
Positions:
Times:
Perspectives:
Percepts:
States:
Actions:
Logic:

In principle, though, any word can be in any category at any time, and can be in a relationship to any other word in any category, being absolute, relative or generalized at any time.

Examples:

Objects:
Name: Car, Book, Street, Phone, Leaf, Atom, Universe, Lamp, 1, 2, 3
Measurement: absolute, relative, generalized.

Locations:
Name: Berlin, Kitchen, Earth
Measurement: absolute, relative, generalized.

Positions:
Name: Top, Bottom, Left, Right, Up, Down, Forwards, Backwards, Over, Under, Here, There
Measurement: absolute, relative, generalized.

Times (Past, Present, Future):
Name: Midday, 12, 2001, tomorrow
Measurement: absolute, relative, generalized.

Perspectives:
Name: I, You, me, He, She, Us, Them, We, Our, They, Their
Measurement: absolute, relative, generalized.

States:
Name: Hot, Cold, Happy, Wonderful, Nasty, Frozen, Shiny, Red
Measurement: absolute, relative, generalized.

Percepts:
Name: Feel, See, Hear, Taste, Smell
Measurement: absolute, relative, generalized.

Actions:
Name: Run, Lay, Sleep, Touch
Measurement: absolute, relative, generalized.

Logic:
Name: If, then, else, and, or, not

Helpers
Object Helpers

the -> suggests absolute individual object
an / a -> suggests generalized object category or object
of -> suggests object is grouped above, with or under other objects
with -> suggests object relationship to other objects
among -> suggests object relationship with objects of the same category
in -> suggests object containment in another object or state

State Helpers
is -> suggests absolute present state
be -> suggests future state

Time Helpers
at -> suggests time

Position Helpers
to -> suggests position
at -> suggests position
between -> suggests position

Action Helpers
by -> suggests action
for -> suggests action

The words above are some of the helpers that suggest objects, and how they relate, in a language.
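The helper lists can be held in a simple lookup table. The following is only a minimal sketch: the constant HELPERS and the method helper_categories are names I made up for illustration, and the word lists are copied from the helper lists in this post.

```ruby
# Helper words grouped by the category they hint at (copied from the lists in this post).
HELPERS = {
  object:   %w[the an a of with among in],
  state:    %w[is be],
  time:     %w[at],
  position: %w[to at between],
  action:   %w[by for]
}.freeze

# Returns every category a word may hint at.
# A word such as "at" appears in more than one list, so the result is an array.
def helper_categories(word)
  HELPERS.select { |_category, words| words.include?(word.downcase) }.keys
end

p helper_categories("at")   # [:time, :position]
p helper_categories("The")  # [:object]
p helper_categories("car")  # []
```

Returning an array rather than a single category keeps the ambiguity visible: "at" on its own cannot tell us whether a time or a position follows.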

Before a machine can start to understand the meaning of words and their context, it needs to know what objects are. Babies spend most of their time doing nothing but experiencing objects, their states and positions, using the percepts they have (see, feel, hear, smell and taste). This is great news for babies, but sucks for machines. I suspect it would be difficult to communicate with someone who was never able to see, hear, smell or taste, and whose only feeling is the pressure in their right index finger, which they can use to feel Braille letters running under it.

Having said that, it is possible for a machine to "understand" what you are talking about.

1. Being a baby.

- learning objects.

As we are lazy parents, our baby should learn this stuff on its own by "reading" lots and lots of stuff.

Basically, all it needs to do is scan the net with an object rule and gather snippets of text.

Rule: (Object Helper) text (.,)(Object Helper)(Logic)

All this says is: collect text that starts with one of the object helper words and ends with . or , or one of the object helper words or one of the logic words.
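A rough sketch of how the rule could be applied to a piece of text is shown here. It is not the filter that produced the list in this post: the method name object_snippets, the whitespace tokenising and the choice to drop snippets that contain only a helper are all my own assumptions.

```ruby
OBJECT_HELPERS = %w[the an a of with among in].freeze
LOGIC_WORDS    = %w[if then else and or not].freeze

# Collects snippets that start with an object helper and end at ". " or ", ",
# at the next object helper, or at a logic word.
def object_snippets(text)
  snippets = []
  current  = nil # words of the snippet being built, or nil when outside one

  text.split.each do |token|
    word      = token.downcase.gsub(/[^a-z0-9'-]/, "")
    is_helper = OBJECT_HELPERS.include?(word)

    if is_helper || LOGIC_WORDS.include?(word)
      # A helper or logic word closes the running snippet.
      # Snippets that hold nothing but the helper itself are dropped.
      snippets << current.join(" ") if current && current.size > 1
      current = is_helper ? [token.sub(/[.,]\z/, "")] : nil
    elsif current
      current << token.sub(/[.,]\z/, "")
      if token =~ /[.,]\z/
        # Trailing "." or "," closes the snippet as well.
        snippets << current.join(" ")
        current = nil
      end
    end
  end

  snippets << current.join(" ") if current && current.size > 1
  snippets
end

p object_snippets("The Taliban and the United States met in Kabul, a broker said.")
# ["The Taliban", "the United States met", "in Kabul", "a broker said"]
```

Note that the output is deliberately noisy ("the United States met"), just like the hand-collected list: the rule knows nothing about grammar, so cleaning up is left to the later deduction steps. Abbreviations such as "U.S." would also end a snippet early in this sketch.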

Applying the rule to written text should result in a collection of snippets like this:

the United States.
The acknowledgment
a new diplomatic overture
the Obama administration
the long-running conflict
a CNN exclusive interview
a broker
the United States
the Taliban
the Obama Administration
the military fight against
the Taliban
a "stalemate"
a recent influx
American combat troops
the deadlock
the consensus
the region
the United States cannot win
the war
a resolution
the conflict
a political
a military victory
a resolution
the United States
the Pakistan military
the Inter-Services Intelligence directorate (ISI)
the first opportunity
a breakthrough
the Afghan war
the U.S. invasion
the fighters' alliance
the United States during
the Soviet war
the Pakistan military
the ISI
the forefront
the whole struggle
the contacts
the organizations like [Mullah Omar's Taliban and Gulbuddin Hekmatyar] doesn't mean
the funding
the 9/11 attacks Pakistani policy
the groups did
the state followed
the army followed
the ISI followed
the world shuts its last door
The communication
in Afghanistan
the ability
the Taliban

the United States

the warring parties,

the dialogue table

a former head

the ISI,
the CIA
the "Godfather
the Taliban
a terrorist
the complete withdrawal

As you can see, the above was taken from a CNN news article using the object filter rule.

Now the point is that we do not want to get into any language grammar rules, or to have to know what words mean, in order to deduce some understanding of the language. With only the language helpers, we should (eventually) be able to understand the natural language and deduce its grammar rules.

For example, from the above we can deduce that the following are objects:

a terrorist, the Taliban, the Godfather, the CIA, the ISI, the contacts, the funding, the Soviets, the forefront, the United States, a breakthrough, the involvement

We can also deduce that:

terrorist, breakthrough – are generalized objects
Taliban, Godfather, CIA, ISI, contacts etc. – are absolute objects or groups of objects
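This deduction follows directly from the object helpers ("the" suggests an absolute object, "a" / "an" a generalized one), so it can be sketched in a few lines. The method name classify_snippet and the :unknown fallback are my own additions for illustration.

```ruby
# Splits a snippet into its helper and its name, then labels the name
# according to the helper: "the" -> absolute, "a"/"an" -> generalized.
def classify_snippet(snippet)
  helper, *rest = snippet.split
  kind = case helper.to_s.downcase
         when "the"     then :absolute
         when "a", "an" then :generalized
         else                :unknown # e.g. "in", which says nothing about this
         end
  [rest.join(" "), kind]
end

p classify_snippet("a terrorist")     # ["terrorist", :generalized]
p classify_snippet("the CIA")         # ["CIA", :absolute]
p classify_snippet("in Afghanistan")  # ["Afghanistan", :unknown]
```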

Furthermore:

“followed” is not an object (because it was left on its own after removing the object from the snippet), so removing it entails that the following are also objects:

the army, the state, the ISI

As “the army” is an object, Pakistan must also be an object.
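The “followed” step can be sketched as two small passes: first find the words left over after removing an already known object, then strip those leftovers from the other snippets. Both method names below are hypothetical, and the matching is a plain, case-sensitive prefix test.

```ruby
# Words that remain after removing a known object from the front of a snippet
# are taken to be non-objects.
def deduce_non_objects(snippets, known_objects)
  leftovers = snippets.flat_map do |snippet|
    known_objects.map do |object|
      prefix = object + " "
      snippet[prefix.length..-1] if snippet.start_with?(prefix)
    end
  end
  leftovers.compact.uniq
end

# Drops known non-objects from the end of each snippet, keeping at least one word.
def strip_non_objects(snippets, non_objects)
  snippets.map do |snippet|
    words = snippet.split
    words.pop while words.size > 1 && non_objects.include?(words.last)
    words.join(" ")
  end
end

snippets    = ["the state followed", "the army followed", "the ISI followed"]
non_objects = deduce_non_objects(snippets, ["the ISI"])

p non_objects                               # ["followed"]
p strip_non_objects(snippets, non_objects)  # ["the state", "the army", "the ISI"]
```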

After that we are left with sentences of 3 or more words. Let's look at the 3-word sentences:

The dialogue table
The warring parties
The united states

Either dialogue or table is an object, action, time, measurement, percept or state. But until we know more, we will just consider each of them an object in its own right.

We also have the snippet:

“In Afghanistan”

We can deduce that Afghanistan is either a state or an object.


Wednesday, 3 December 2008

Start/Stop MySQL on Leopard

sudo /usr/local/mysql/support-files/mysql.server start

sudo /usr/local/mysql/support-files/mysql.server stop

Thursday, 13 November 2008

RSpec installation

>sudo gem install ZenTest
>sudo gem install rspec-rails

in application root:
>ruby script/generate rspec

autotest (if it hangs use >RSPEC=true autotest)


rake spec
rake spec:app

rake spec:models
rake spec:controllers
rake spec:views
rake spec:helpers
rake spec:plugins
rake --tasks:plugins

Friday, 7 November 2008

Compare time

module ApplicationHelper
  # Format shown to the user, e.g. "Fri Nov 07 15:31:56 +0100 2008"
  SOMETIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"
  # Format used for database comparisons, e.g. "2008-10-24 22:06:18"
  SOMETIME_FORMAT_DB = '%Y-%m-%d %H:%M:%S'

  # Start of the month containing time, formatted for 'user' or 'db'.
  def sometime_this_month(time, format='user')
    return time.beginning_of_month.strftime(SOMETIME_FORMAT) if format == 'user'
    return time.beginning_of_month.strftime(SOMETIME_FORMAT_DB) if format == 'db'
  end

  # Start of the day containing time, formatted for 'user' or 'db'.
  def sometime_today(time, format='user')
    return time.beginning_of_day.strftime(SOMETIME_FORMAT) if format == 'user'
    return time.beginning_of_day.strftime(SOMETIME_FORMAT_DB) if format == 'db'
  end

end


Time based finds.

The created_at time from the database looks like this: "2008-10-24 22:06:18 +0200"

However, Time.now comes back in a different format: "Fri Nov 07 15:31:56 +0100 2008"
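Because the two strings are formatted differently, comparing them as text does not work; parsing both into Time objects first does. The following is a plain-Ruby sketch using Time.strptime from the standard library. The constant and method names are hypothetical, and the two format strings are assumptions that mirror the quoted examples (including the UTC offset), not the SOMETIME_* constants from the helper.

```ruby
require "time"

# Assumed formats, matching the two example strings quoted in this post.
DB_TIME_FORMAT  = "%Y-%m-%d %H:%M:%S %z"
NOW_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def parse_db_time(string)
  Time.strptime(string, DB_TIME_FORMAT)
end

def parse_now_time(string)
  Time.strptime(string, NOW_TIME_FORMAT)
end

created_at = parse_db_time("2008-10-24 22:06:18 +0200")
now        = parse_now_time("Fri Nov 07 15:31:56 +0100 2008")

puts created_at < now                             # true
puts now.getutc.strftime("%Y-%m-%d %H:%M:%S")     # 2008-11-07 14:31:56
```

Once both values are Time objects, the different offsets (+0200 and +0100) no longer matter for the comparison.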


Named scope with variables

In the model file (in this case weight_meassurements.rb):

named_scope :user_weight_meassurements, lambda { |user_id| { :order => 'created_at DESC', :conditions => ["user_id = ?", user_id] } }

named_scope :today, lambda { |today| { :conditions => ["created_at > ?", today] } }

Calling @user.weight_meassurements.today('2008-11-07 00:00:00') from the controllers or helpers will retrieve all weight_meassurements records belonging to the user that were created after midnight on the given date.

You can also call WeightMeassurements.user_weight_meassurements(@user.id) to get all of this user's weight measurement records.

Monday, 27 October 2008

MySQL import

mysql -p -u{username} {db-name} < filename

Converting database time to human time

model.created_at.beginning_of_day.strftime("%B, %Y")
>October, 2008
model.created_at.strftime("%d").to_i.ordinalize
>25th

%a weekday name (abbreviated).
%A weekday name (full).
%b month name (abbreviated).
%B month name (full).
%c date and time (locale).
%d day of month [01,31].
%H hour [00,23].
%I hour [01,12].
%j day of year [001,366].
%m month [01,12].
%M minute [00,59].
%p AM or PM.
%S second [00,60].
%U week of year (Sunday) [00,53].
%w weekday [0(Sunday),6].
%W week of year (Monday) [00,53].
%x date (locale).
%X time (locale).
%y year without century [00,99].
%Y year with century (e.g. 2008).
%Z timezone name.
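A few of these directives in action, as a plain-Ruby sketch that runs without Rails. Since ordinalize comes from ActiveSupport, the ordinal method below is a hypothetical stand-in written just for this example, and the date is made up to match the outputs in this post.

```ruby
# Plain-Ruby stand-in for ActiveSupport's ordinalize (hypothetical helper).
def ordinal(number)
  return "#{number}th" if (11..13).include?(number % 100)
  suffix = { 1 => "st", 2 => "nd", 3 => "rd" }.fetch(number % 10, "th")
  "#{number}#{suffix}"
end

time = Time.utc(2008, 10, 25, 14, 5, 9)

puts time.strftime("%B, %Y")       # October, 2008
puts time.strftime("%a %d %b %y")  # Sat 25 Oct 08
puts time.strftime("%I:%M:%S %p")  # 02:05:09 PM
puts time.strftime("%j")           # 299
puts ordinal(time.day)             # 25th
```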