Hi Friends,

Even as I launch this today ( my 80th Birthday ), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder,"Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now as I approach my 90th birthday ( 27 June 2023 ) , I invite you to visit my Digital Avatar ( www.hemenparekh.ai ) – and continue chatting with me , even when I am no more here physically

Tuesday 10 December 1996

LOGIC FOR HOW TO CONSTRUCT PHRASES?

·      A large number of words (usually common nouns or adverbs) are accompanied by "adjectives"

·     Normally the adjective precedes the verb or noun.

·     A bio - data (resume) is not a piece of fiction/ literature – hence

a.  Most frequently occurring "common nouns" & "adverbs" are perhaps no more than a few hundred
b.  Most frequently occurring "adjectives" which precede these common nouns & adverbs are also no more than a few hundred.
c.  Number of probable combinations of these words & these adjectives is also quite limited.
d.  Words - phrases - sentences are largely "Statements of Fact" and not "figments of imagination", hence repetitive in nature and content
e.  The "sequences" in which these words - phrases appear (to make up a sentence) are fairly "Well - defined", with very little variation
 All of the above makes it reasonably simple to devise a RESUME - GENERATING SOFTWARE. As mentioned in one of my earlier (concept) notes, what we need to do is to
·     Take a large number of biodatas
·     Scan / OCR/ Index all the words appearing in these bio datas
·     Study & record the "occurrences" (the frequencies) of
                     I.      Each word  (Verb - adverb - adjective - preposition - noun)
                   II.      Set of any two words (prefixed / suffixed to a given word)
                 III.      Set of any three words
                  IV.      So on & So forth.
·     We have already created a "directory" of some 6052 words (out of a total of 1 million words), each of which has occurred more than 10 times in 3500 converted bio - datas.
·     Soon I will send you the (indexed) words contained in 100 originally typed bio - datas, for which scanned bit – map image files have already been given to you on some 15/20 floppies.
·     If required, we can go on scanning 50 to 100 typed bio - datas every day (we have some 35000 typed bio - datas available with us) and go on
    - Increasing the "Population" of words
    - Improving the "frequency" of occurrence
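The frequency study described above (single words, sets of two words, sets of three words, and so on) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software the note refers to: the function names `tokenize`, `ngram_counts` and `build_directory` are hypothetical, the two sample texts are invented, and the threshold is a parameter (the note's directory kept words occurring more than 10 times).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(resumes, n):
    """Count every run of n consecutive words across all resumes."""
    counts = Counter()
    for text in resumes:
        words = tokenize(text)
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


def build_directory(resumes, min_count):
    """Keep only the single words that occur more than min_count times."""
    unigrams = ngram_counts(resumes, 1)
    return {key[0]: c for key, c in unigrams.items() if c > min_count}


# Hypothetical sample text, not taken from real bio-datas.
resumes = [
    "Responsible for production planning and quality control",
    "Responsible for quality control of finished goods",
]

bigrams = ngram_counts(resumes, 2)
directory = build_directory(resumes, min_count=1)

print(bigrams[("quality", "control")])
print(sorted(directory))
```

With these two sample lines, the pair "quality control" is counted twice, and the directory keeps the four words that appear in both lines. On a real collection the same counts would simply be rerun as more bio-datas are scanned, which enlarges the word population and firms up the frequencies.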
But the ideal time to launch this massive scanning operation (of 35000 bio - datas) is after ARDIS & ARGIS are ready, even in crude form. Then the massive scanning operation would itself become a "self - learning / improving" exercise for the software, making it truly "intelligent".
In the enclosed pages, I have, by studying some 30/40 bio - datas manually, tried to figure out:
   - What are commonly occurring adjectives before some nouns & adverbs?
   - What words appear, before (and in some cases "after") following prepositions?
·     OF
·     WITH
·     FOR
·     AT
·     ON
·     IN
·     AFTER
·     TO
·     BY
·     AS
·     SINCE
·     FROM
·     UNDER
·     OUT
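The manual study of words around these prepositions could be automated along the following lines. This is a rough sketch under assumptions: the function name `preposition_contexts` and the sample sentences are hypothetical, and punctuation and sentence boundaries are ignored for simplicity.

```python
import re
from collections import Counter, defaultdict

# The prepositions listed in the note.
PREPOSITIONS = {
    "of", "with", "for", "at", "on", "in", "after",
    "to", "by", "as", "since", "from", "under", "out",
}


def preposition_contexts(resumes):
    """Tally the word just before and just after each preposition."""
    before = defaultdict(Counter)
    after = defaultdict(Counter)
    for text in resumes:
        words = re.findall(r"[a-z]+", text.lower())
        for i, word in enumerate(words):
            if word not in PREPOSITIONS:
                continue
            if i > 0:
                before[word][words[i - 1]] += 1
            if i + 1 < len(words):
                after[word][words[i + 1]] += 1
    return before, after


# Hypothetical sample text, not taken from real bio-datas.
samples = [
    "In charge of quality control",
    "Manager of production, in charge of purchase",
]

before, after = preposition_contexts(samples)
print(before["of"].most_common(1))
print(after["in"].most_common(1))
```

For these samples, "charge" is the most frequent word both before "of" and after "in". A neighbouring preposition is counted as context too (for example "out of"), which may or may not be wanted in a real study.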
To dissect / analyse even a few hundred bio - datas manually is a Herculean task! To study several thousand bio - datas this way is next to impossible! But it is precisely in such set and repetitive (logical) activity that computer hardware & software triumph over the human brain. Let us harness these technological advances and create something superior to what RESUMIX has done.

h.c.parekh
