Notes: March 1997

Tuesday, 18 March 1997

SCOPE OF WORK

Cyril / Hugh / Yogesh

The scope of work is shown schematically in the enclosed diagram. The basic inputs are TEXTS of various types. Printed texts will be scanned and run through OCR software to convert them into ASCII files. Then there will be texts received ELECTRONICALLY, such as

- Floppies

- Fax

- E-mails

- Computer files (Dial-Up / Internet / Intranet)

In both cases, the software will search for, identify and pick out KEYWORDS and place them in appropriate BINS / FIELDS, based on the "meaning" of each keyword. In this way, the software will create a database (or several databases). This basic process will remain the same irrespective of the type / size / structure / format of the document being processed (whether a printed or an electronically received document).

Of course, all the keywords picked out from a given document will be linked to that document (identified as belonging to that particular document).
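The pick-out-and-bin process described above can be sketched roughly as follows. This is only a minimal illustration: the lookup table FIELD_OF, the field names and the document numbers are all hypothetical, and a real table would have to be far larger.

```python
from collections import defaultdict

# Hypothetical lookup table: keyword -> FIELD (BIN). Invented for illustration.
FIELD_OF = {
    "bombay": "LOCATION",
    "pune": "LOCATION",
    "welding": "SKILL",
    "manager": "DESIGNATION",
}

def index_document(doc_id, text, database):
    """Pick out known keywords and file each under (field, keyword) -> document ids."""
    for raw in text.lower().split():
        word = raw.strip(".,;:()\"'")      # drop punctuation stuck to the word
        field = FIELD_OF.get(word)
        if field is not None:
            # Every keyword stays linked to the document it came from
            database[(field, word)].add(doc_id)
    return database

database = defaultdict(set)
index_document("DOC-1", "Works Manager, Pune. Skilled in welding.", database)
index_document("DOC-2", "Sales manager based in Bombay", database)
```

The same function serves a scanned page or an electronically received file, since both arrive as plain text by this stage.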

Once the DATABASE is created, it should be capable of being QUERIED by using any of the KEYWORDS (one or more, in AND / OR fashion). For each keyword, a "SYNONYM RING" will have to be created.

The search will produce a short-list of all the records (documents) in which the specified KEYWORDS appear. We should be able to VIEW all such records on the screen (one by one) or take print-outs.
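A rough sketch of such a query, assuming the database can be viewed as a mapping from keyword to the set of records containing it. The synonym rings and record numbers below are hypothetical examples, not actual data.

```python
# Hypothetical synonym rings: every member of a ring is treated as the same keyword.
SYNONYM_RINGS = [
    {"bombay", "mumbai"},
    {"b.e.", "be", "bachelor of engineering"},
]

def ring_for(keyword):
    """Return the synonym ring containing the keyword, or the keyword alone."""
    kw = keyword.lower()
    for ring in SYNONYM_RINGS:
        if kw in ring:
            return ring
    return {kw}

def search(index, keywords, mode="AND"):
    """Short-list records matching all (AND) or any (OR) of the keywords."""
    hits = []
    for kw in keywords:
        records = set()
        for variant in ring_for(kw):       # search on every synonym
            records |= index.get(variant, set())
        hits.append(records)
    if not hits:
        return set()
    return set.intersection(*hits) if mode == "AND" else set.union(*hits)

# Hypothetical index: keyword -> record numbers
index = {"bombay": {1, 2}, "mumbai": {3}, "welding": {2, 3, 4}}
shortlist = search(index, ["Mumbai", "welding"], mode="AND")
```

Here a query on "Mumbai" also finds records that say "Bombay", because both sit on the same ring.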

By using a computer-memorized STANDARD LETTER, we should be able to send out the short-list to a given client.

Besides responding to a QUERY shot from one of our own LAN nodes, the software must permit a client to shoot such a query from his own office computer by remotely logging on to our Server through a dial-up modem or through an Internet connection. This feature (of remote query) is absolutely essential. Of course, we must provide for a password within a password within a password (!) to ensure data security.

Of course, what part of the database each user will be allowed to access (locally or remotely) will be strictly defined in advance and rigidly administered. The users are :-

- Self

- Associates (e.g. Mankodi / Gangolli etc.)

- Candidates

- Clients

- Foster Partner Members and, maybe,

- Anyone from Public.

What is expected of the software is REMOTE ACCESS CAPABILITY. And that should be built into the Software RIGHT NOW.

However, the decisions on

- Which users will be allowed remote access

- To what databases, to shoot

- What types of queries, and

- When (point of time)

will be spread out over the next 2 / 3 years.
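One way to keep these four decisions (who, which database, what query, from when) open while building remote access in now is a simple rule table. The sketch below is an assumption for illustration only: the user classes, database names, query types and dates are all invented.

```python
from datetime import date

# Hypothetical rule table: (user class, database, query type) -> first date allowed.
# Every entry and date here is invented for illustration.
REMOTE_ACCESS_RULES = {
    ("candidate", "biodata", "edit_own"): date(1998, 3, 1),
    ("candidate", "job_bulletin_board", "search"): date(1998, 3, 1),
    ("client", "biodata", "count_only"): date(1998, 3, 1),
}

def remote_access_allowed(user_class, database, query_type, on_date):
    """Allow only combinations that are listed and whose start date has arrived."""
    start = REMOTE_ACCESS_RULES.get((user_class, database, query_type))
    return start is not None and on_date >= start
```

Opening a new type of remote access later then means adding a row to the table, not changing the software.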

However, in the first phase itself, we want

1. Candidates to enter and modify their own biodatas remotely. This is the ONLY WAY we can hope to build up a

- Large candidate database

- Quickly

- Without hiring an army of data-entry operators or persons to scan typed biodatas.

2. Candidates to be able to shoot queries to the JOB BULLETIN BOARD. This is the MOST POWERFUL VALUE-ADDED SERVICE (VAS) as far as employment / change-seeking executives are concerned. This VAS will be available only to those executives who register / enroll with us by

- Remotely entering their biodatas

- Sending biodatas on floppies.

Once again, the idea is to transfer the data-entry burden onto the candidate himself.

3. Clients to be able to shoot "Executive Search Queries" remotely (from their own computers), exactly the way we would have done locally. The software will search and tell the client (or potential client) "How many suitable candidates do we have in our database?", i.e. a number. Nothing more! But this is a great way of HOOKING him! If his need is really SERIOUS, he would, next day, send a cheque (advance) along with a Search Request, considering that it would otherwise

- Cost him over a lakh of rupees to advertise that vacancy

- Take him 8 weeks to get a response.
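The "a number, nothing more" rule for remote clients could look something like this. It is a sketch only; the function name, the reply layout and the PEN values are hypothetical.

```python
def search_reply(matching_pens, remote):
    """Local users get the short-list; remote clients get only the count."""
    reply = {"suitable_candidates": len(matching_pens)}
    if not remote:
        # The PENs themselves never leave our own LAN
        reply["pens"] = sorted(matching_pens)
    return reply
```

The same search runs in both cases; only the reply differs.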

The bait of REMOTE ACCESS / IMMEDIATE ACCESS / CHEAP ACCESS is simply irresistible!

This is an IMPORTANT aspect of the cost-benefit analysis of the software to be developed.

The rest of the REMOTE ACCESS applications can wait for 2 / 3 years, as each database gets built up by scanning thousands of pages of other documents, some of which are enclosed herewith.

h.c.parekh

Tuesday, 4 March 1997

FELYNX LETTER/ OFFER

"What" needs to be included in the "Scope of Work"

· Besides "typed" biodatas / resumes, the software shall also take care of

- All other typed / printed documents as shown on "chart" (dt. 5-3-1997)

- All "electronic" documents.

· A PEN will be automatically / serially allotted to each biodata. In similar fashion, some other unique number must be automatically allotted to other documents / files.

· The search engine shall allow a "text-based" query without having to "codify" any keyword / field. The query shall use plain English words / phrases / sentences with "and / or" and "exclude" options.

· The software shall automatically create "reference files" of all "variations" in which persons write / type / speak a keyword. The software shall treat all such "variations" as "synonyms" and conduct the search on all "synonyms".
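The text-based query with "and / or" and "exclude" options, mentioned in the points above, might be evaluated along these lines. The query syntax assumed here (terms joined by "and" / "or", with one optional trailing "exclude" clause) is a hypothetical simplification, and the matching is plain substring matching with no synonym handling.

```python
import re

def matches(query, text):
    """Check one document's text against a plain-English query.

    Assumed syntax (hypothetical): terms joined by 'and' / 'or', optionally
    followed by 'exclude term [or term ...]'. 'and' binds tighter than 'or'.
    """
    text = text.lower()
    include_part, _, exclude_part = query.lower().partition(" exclude ")
    # Any excluded term found in the text rejects the document outright
    for term in re.split(r"\s+(?:and|or)\s+", exclude_part):
        if term.strip() and term.strip() in text:
            return False
    # The document passes if any OR-group has all of its AND-terms present
    for group in include_part.split(" or "):
        terms = [t.strip() for t in group.split(" and ")]
        if all(t and t in text for t in terms):
            return True
    return False
```

For example, the query "welding and pune exclude sales" accepts "Welding engineer, Pune" but rejects "Welding sales engineer, Pune". Nothing has to be codified: a whole phrase such as "project management" is simply treated as one term.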

Other/ Notes / Points

· Since we have already decided to change over to

- Operating System - Windows NT

- RDBMS - MS-SQL

the software / search engine to be developed shall work on these. It shall, however, be capable of being ported to other improved OS / RDBMS.

· Since the FELYNX quotation includes the cost of "TOOLS / COMPILERS" (approx. cost Rs. 2.0 lakhs), 3P will not be required to incur any cost on these.

· Accuracy of Performance :-

80 % accuracy - End of phase 3 (i.e. installation of alpha version)

90 % accuracy - End of phase 4 (i.e. deployment & debugging)

98 % accuracy - Within 12 months of installation of alpha version.

INTEGRATION

Without any "hiccups", the software shall seamlessly integrate with

· Speech Recognition Software (including Voice-Mail and IVRS, i.e. Interactive Voice Response Systems, available in the market)

· Other Business Application Software Modules that we may ask you to develop in course of time.

e.g.

- Order Execution Module

- Finance / Accounts / Billing Module

- Other standard "Bought - out" Software packages.

ENCRYPTION

Software shall automatically encrypt the databases generated by it so that no one can benefit by pilfering it.

FIREWALL

Since the software will reside on a client-server network (Intranet / Extranet / Internet), it must incorporate fool-proof FIREWALL protection against infiltration / virus attack.

TIME - FRAME

Phase 1: Submission of a detailed System Design Document .......... End Dec. 1997

Phase 2: Detailed coding .......... End Jan. 1998

Phase 3: Install alpha version .......... End Feb. 1998

Phase 4: Install beta version .......... End April 1998

Phase 5: Complete documentation & training .......... End May 1998

h.c.parekh

Monday, 3 March 1997

SCOPE OF WORK - KEYWORD

Yogesh

The scope of this project shall comprise development of computer software.

TASK # 1.

The software shall be capable of reading typed biodatas / resumes and picking out all the KEYWORDS.

Having picked out the KEYWORDS, the software shall assign each KEYWORD to the specific FIELD (BIN) to which it belongs. While doing so, each keyword and its field will be linked to the particular biodata (Permanent Executive Number, PEN). In this way, the software shall create a DATABASE of keywords which are

- Linked to specific executives (PEN)

- Linked to Specific FIELDS.

This means developing a SEARCH ENGINE for

- Searching out each keyword

- Deciphering each keyword and deciding what it "means" (to enable placing it in the appropriate field).

The SEARCH ENGINE shall also be able to carry out a reverse search, i.e. given one or more KEYWORDS (under which to search), it should be able to identify / list all executives (PEN) against whose names these keywords appear (i.e. exist in their biodatas).

TASK # 2

The second task of the software would be to recreate / regenerate the biodata (of any executive) in a specific format called CONVERTED BIODATA.

The converted biodata would have the option of printing or not printing

- Executive Name

- Current Employer Name

and, when not printed, these are to be replaced by appropriate alternate CODES / descriptions.
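The print / do-not-print option could work roughly as below. The record layout, the field names and the choice of the PEN as the alternate code for the name are all assumptions made for this sketch; the actual codes remain to be decided.

```python
def convert_biodata(record, print_name=True, print_employer=True):
    """Return the lines of a converted biodata, masking fields not to be printed."""
    # Assumption: the PEN stands in for the name, an employer code for the employer
    name = record["name"] if print_name else "PEN " + str(record["pen"])
    employer = record["employer"] if print_employer else record["employer_code"]
    return [
        "Executive        : " + name,
        "Current Employer : " + employer,
        "Designation      : " + record["designation"],
    ]

# Hypothetical record, invented for illustration
sample = {
    "pen": 1001,
    "name": "A. B. Shah",
    "employer": "XYZ Engineering Ltd",
    "employer_code": "A large engineering company",
    "designation": "Works Manager",
}
masked = convert_biodata(sample, print_name=False, print_employer=False)
```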

In the development of this software, the following may be assumed.

a. Use of existing scanner (HP - Model 3P) & Scanning Software

b. Use of existing OCR software (OMNIPAGE 5.1)

Of course, the software shall be capable of using

- High capacity scanners

- Improved versions of OCR software

- DMP / Inkjet / Laser Printers & even line printers.

INPUTS

The basic input will be any typed biodata.

In phase #2, we would like you to consider any printed advertisement (job advt.) also as an INPUT. This would be for the purpose of creating a DATABASE for the JOB BULLETIN BOARD.

This is because the basic capability of the search engine shall remain the same, irrespective of whether it is working on

- a bio data

- a job - advertisement.

This basic ability is to

- Pick out KEYWORDS

- Place them in appropriate FIELDS

- Rearrange these in some formatted / tabular OUTPUT statement

- Allow them to be "searched", given certain SEARCH PARAMETERS (which will be keywords)

- Create a "listing" of "Records Found"

OUTPUTS

As mentioned earlier, the desired outputs are

- A database of keywords, deciphered & stored in appropriate FIELDS

- A converted bio - data OR

- A converted (Tabulated) job advertisement

PLATFORM & UPGRADIBILITY

A. Hardware

The software should be designed for a Pentium-based server & 486-based LAN. Of course, these will be upgraded in course of time.

B. Software

You may use _________ language and create a database (RDBMS) ____________. Please configure around ___________ Operating System, with a provision to change over to ________ in course of time.

INTEGRATION

The software should be capable of seamless integration with our existing software, which are as follows.

We are also planning to incorporate OS/2 Warp 4 (IBM) speech recognition software on our LAN. The software should be capable of thorough integration with this or any other improved voice-recognition software.

PERFORMANCE ACCURACY

After installation and debugging, the software should be capable of correctly picking up (identifying) and deciphering (i.e. deciding to which field the keyword belongs) at least 80% of the keywords appearing in each biodata / job advertisement.

Within 6 months of the installation, this figure should go up to 90%, and within 12 months up to 98%.
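To make these percentages checkable, accuracy could be measured per document against a hand-prepared answer sheet, counting a keyword as correct only if it was both picked up and placed in the right field. This is one possible reading of the requirement, not an agreed definition, and the sample keywords below are invented.

```python
def keyword_accuracy(extracted, expected):
    """Percentage of expected keywords that were picked up AND put in the right field."""
    if not expected:
        return 0.0
    correct = sum(1 for kw, field in expected.items() if extracted.get(kw) == field)
    return 100.0 * correct / len(expected)

# Hypothetical answer sheet for one biodata, prepared by hand
expected = {
    "pune": "LOCATION",
    "welding": "SKILL",
    "manager": "DESIGNATION",
    "b.e.": "QUALIFICATION",
    "telco": "EMPLOYER",
}
# What the software produced: one keyword wrongly binned
extracted = dict(expected, telco="LOCATION")
accuracy = keyword_accuracy(extracted, expected)   # 4 of 5 correct
```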

TIME - FRAME

For Installing

The software shall be installed on our system within 6 months from the date of this agreement. The detailed time frame is shown at Annex : _________

For debugging

The debugging shall be carried out within 2 months of the date of installation.

FUTURE SUPPORT

We are looking at a long-term relationship with FELYNX. We therefore expect that we will continue to receive support from you in future as far as

- Development of new modules of software

- Maintenance of the software being developed.

SECRECY / EXCLUSIVITY

After installation and debugging, you will hand over the SOURCE CODE to us, which will become our "Intellectual Property". You will not sell this software or part with it in any manner whatsoever, to any other person or organisation, at any time. You will not share or pass on to any other person or organisation any information given to you or collected by you relating to our business, including any of our future plans that you may come to know of.

h.c.parekh

DEFINITION BIO-DATA (RESUME)

Yogesh

Samples of typed biodatas have already been given to you. These are quite typical. Some 5% - 10% of the biodatas arrive by fax. The software should be able to take care of these as well.

Before long we shall start receiving biodatas

· By E-mail

· On floppies

· By remote "log-in" by candidates

· By candidates telephonically "dictating" to 3P staff

· By VOICE-MAIL.

The software should be able to take care of these as well.

While running OCR software (on scanned biodatas), there is a possibility that 80% of the ASCII characters come out OK, leaving 20% "errors" (spelling mistakes). Normally, these errors are corrected by data-entry operators on the screen. This operation slows down the entire process. We would like to eliminate / avoid this operation if possible.

You should examine the possibility of your software taking care of this.
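One possibility worth examining is to match each doubtful OCR word against the list of already known keywords and accept the closest one. The sketch below uses Python's standard difflib module for the matching; the keyword list and the 0.8 cutoff are assumptions chosen for illustration.

```python
import difflib

# Hypothetical list of known keywords; the real list would come from the database
KNOWN_KEYWORDS = ["engineer", "manager", "marketing", "production", "bombay"]

def correct_word(word, cutoff=0.8):
    """Return the closest known keyword, or the word unchanged if none is close enough."""
    match = difflib.get_close_matches(word.lower(), KNOWN_KEYWORDS, n=1, cutoff=cutoff)
    return match[0] if match else word

fixed = correct_word("enginecr")   # typical OCR slip: 'e' read as 'c'
```

Words with no close match are left untouched, so only those would still need an operator's attention.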

KEY - WORDS

To identify & decipher all grammatically defined words such as

- Verb

- Adverb

- Preposition

- Adjectives

- Nouns (Common Nouns/ Proper Nouns)

FIELDS

To identify & decipher Adjectives and Nouns (whether common noun or Proper Noun) which are BIODATA RELATED. These include:-

· Name of Candidate

· Name of Boss / Colleague / Subordinate

· Name of Companies.

· Addresses (Several Types) PIN

· Locations / Cities / Countries

· Birth - Date / Age

· Industry / Employer

· Phone Nos. (incl. STD / ISD Codes)

· Dates (of joining / leaving etc. )

· Duration of employment (Periods)

· Edu. Qualifications / Years of passing / Colleges / Universities

· Salary

· Designations / Positions held

· Departments

· Functions

· Skills

· Knowledge

· Attitudes

· Codes

· Equipments

· Products

· Raw materials

· Manufacturing - related

· Management - related

· Engineering - related

· Techniques

· Processes

· Languages etc.

The above is not a comprehensive list. It is only indicative. A comprehensive list can be prepared, over a period of time, by scanning thousands of biodatas. This could be done by scanning the over 40000 biodatas lying with us. Even if most of these are OBSOLETE, the keywords contained in them are not obsolete, so these could be a good starting point for building a comprehensive list.
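As a starting point for such a list, the scanned biodatas could simply be counted word by word, keeping every word that turns up in more than a few documents as a candidate keyword. A minimal sketch, with an invented threshold and sample texts:

```python
from collections import Counter

def candidate_keywords(biodata_texts, min_documents=2):
    """Count in how many biodatas each word appears; keep the frequent ones."""
    doc_frequency = Counter()
    for text in biodata_texts:
        # A set, so that a word counts once per biodata however often it repeats
        words = {w.strip(".,;:()").lower() for w in text.split()}
        words.discard("")
        doc_frequency.update(words)
    return sorted(w for w, n in doc_frequency.items() if n >= min_documents)

# Hypothetical sample texts
texts = ["Welding engineer, Pune", "Production engineer, Bombay", "Welding supervisor"]
candidates = candidate_keywords(texts)
```

Someone would still have to weed out ordinary words and decide the FIELD for each candidate; the count only tells us which words deserve a look first.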

h.c.parekh