We can divide the NIST-MEP project into three phases for ease of design and implementation:
- Phase-1: Design and implement a prototype search engine based on the UTA profile system.
- Phase-2: Design and implement a module for integrating other partner institutions besides UTA. In this phase we will concentrate on developing the Web Services module.
- Phase-3: Design and implement a module to include non-academic institutions.
  - Adding non-academic companies and including their data within the search domain is the objective of this phase.
  - The UI for adding needs and adding company profiles will be finalized in this phase.
  - The TMAC Form module will be developed in this phase.
Phase-1 is the most challenging part of the project, so it will be designed and implemented ahead of the other phases.
Some high-level design ideas about Phase-1 and Phase-2 are given below:
In this phase we first need to gain insight into implementing the search engine. With this end in view, our plan is to work out a design of the system on paper, and later implement a prototype following that design. We have to take care of the following issues:
i. Conduct research on keyword-search techniques and apply them to extract keywords from a set of paragraphs used as mock user input.
ii. Define rules for matching keywords against the records in the database. We will develop an early database design to serve this purpose and use mock data for testing.
iii. Devise a suitable ranking algorithm for ordering the search results meaningfully.
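The three steps above can be sketched end to end. This is a minimal illustration only, assuming simple term-frequency keyword extraction and an overlap-count ranking; the stopword list, record fields, and names are all hypothetical, not part of the eventual design:

```python
import re
from collections import Counter

# Hypothetical stopword list for the mock input (assumption, not the final rule set)
STOPWORDS = {"the", "a", "an", "and", "or", "of", "in", "on", "for", "to", "with", "is", "are", "we"}

def extract_keywords(text, top_n=5):
    """Step i: pull candidate keywords from free text by frequency, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(top_n)]

def rank_records(keywords, records):
    """Steps ii and iii: match keywords against records, then rank by overlap score."""
    scored = []
    for rec in records:
        doc_words = set(re.findall(r"[a-z]+", rec["profile"].lower()))
        score = sum(1 for k in keywords if k in doc_words)
        if score > 0:
            scored.append((score, rec["name"]))
    return [name for score, name in sorted(scored, reverse=True)]

# Mock data standing in for the early database design
records = [
    {"name": "Dr. Smith", "profile": "Research on nanotechnology and materials science"},
    {"name": "Dr. Lee", "profile": "Machine learning and data mining research"},
]
keywords = extract_keywords("We need expertise in machine learning and data mining")
print(rank_records(keywords, records))  # → ['Dr. Lee']
```

A real prototype would replace the frequency heuristic with whatever extraction technique the research in step i selects, but the pipeline shape (extract → match → rank) stays the same.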
Once we have the bigger picture, our next step is to implement a prototype of this search engine using the mentis framework. The UTA profile system will be used as the source of data for the database.
Once we get good and consistent results using the UTA profile system, we will integrate data from other partner institutions. This will be considered in Phase-2.
One concern for the NIST/MEP project is how we would crawl the various web pages of collaborative partners to obtain their data. Different institutions might structure their web pages differently, and an institution can change its page structure at any time. So writing a web crawler to get the data might not be a good idea, as the maintenance cost would be huge. Hence we are considering a RESTful approach to get the necessary information from the partners.
We will use a RESTful architecture to get the data from the collaborative partners. Some high-level ideas about the structure are as follows:
- A REST service on our side will receive the data from collaborators and store it in the local database.
- Each collaborative partner will have a client at their end that provides a web service.
- The web service on the client side will interact with the REST service residing on the server side.
- A cron job running on the server side at a regular interval will pull data from each client's web service.
- The REST service will receive the data from the cron job and store it in the local database.
- Whenever there is a change in a client's database, the web service will push the update to the server's REST service. In this way, the server will always hold up-to-date information about the clients.
- An authentication mechanism will be implemented at the REST-service end of the server and the web-service end of the client so that no intruder can interfere. OAuth can be used to implement the authentication process.
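The pull/push flow described above can be sketched with the partner's web service simulated in-process. All names, record fields, and the token check are illustrative assumptions (a real deployment would use HTTP endpoints and a proper OAuth flow, not a shared-secret dictionary):

```python
LOCAL_DB = {}                                  # stands in for the server's local database
VALID_TOKENS = {"partner-1": "secret-token"}   # stand-in for an OAuth token store

def partner_web_service():
    """Simulated client-side web service: returns records in the agreed standard format."""
    return [{"id": "p1", "institution": "Partner-1", "expertise": "robotics"}]

def rest_service_store(partner_id, token, records):
    """Server-side REST endpoint: authenticate the caller, then upsert records."""
    if VALID_TOKENS.get(partner_id) != token:
        raise PermissionError("authentication failed")
    for rec in records:
        LOCAL_DB[(partner_id, rec["id"])] = rec

def cron_pull():
    """What the scheduled cron job does at each interval: pull from every partner."""
    rest_service_store("partner-1", "secret-token", partner_web_service())

def client_push(updated_record):
    """Client-initiated push when its database changes, keeping the server current."""
    rest_service_store("partner-1", "secret-token", [updated_record])

cron_pull()
client_push({"id": "p1", "institution": "Partner-1", "expertise": "robotics and automation"})
print(LOCAL_DB[("partner-1", "p1")]["expertise"])  # → robotics and automation
```

Note that pull and push converge on the same authenticated endpoint, so the cron job is only a freshness guarantee; a well-behaved client's pushes keep the server current between intervals.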
The web service on the client side will act as a translator, converting data from the client's database to our pre-defined standard format and vice versa. There are several options for providing this web service:
- If the client already has some sort of web service at their end, they can provide us our required data in the standard format.
- If the client does not have a web service but has a development team, we can provide them an interface which they will implement to meet the needs.
- If the client does not have any development team, we can develop the web service for them.
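The interface we would hand to a partner's development team could look like the sketch below. The method names and the fields of the "standard record" are assumptions for illustration, not a fixed specification:

```python
from abc import ABC, abstractmethod

class PartnerTranslator(ABC):
    """Contract a partner implements to expose its data in our standard format."""

    @abstractmethod
    def fetch_raw(self):
        """Read records from the partner's own database, in its native schema."""

    @abstractmethod
    def to_standard(self, raw_record):
        """Convert one native record to the pre-defined standard format."""

    def export(self):
        """What the partner's web service serves: all records, translated."""
        return [self.to_standard(r) for r in self.fetch_raw()]

class ExamplePartner(PartnerTranslator):
    """Hypothetical partner whose database stores (full_name, research_area) tuples."""
    def fetch_raw(self):
        return [("Jane Doe", "biomedical engineering")]
    def to_standard(self, raw_record):
        name, area = raw_record
        return {"name": name, "expertise": area}

print(ExamplePartner().export())  # → [{'name': 'Jane Doe', 'expertise': 'biomedical engineering'}]
```

Whichever of the three options applies, the partner (or we, on their behalf) only fills in the two abstract methods; the translation logic that the server depends on stays uniform.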
We will also have to consider developing adapters for connecting to the clients' different databases.
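One way to sketch such adapters is a common interface over different backends, shown here with Python's built-in sqlite3 as a stand-in for one partner's database engine; the class and method names are assumptions:

```python
import sqlite3

class SqliteAdapter:
    """Adapter exposing a uniform fetch_all() over a SQLite database."""
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
    def fetch_all(self, query):
        return self.conn.execute(query).fetchall()

class InMemoryAdapter:
    """Adapter exposing the same fetch_all() over rows already loaded in memory
    (e.g., exported from a legacy system with no query interface)."""
    def __init__(self, rows):
        self.rows = rows
    def fetch_all(self, query=None):  # query ignored; the data is pre-loaded
        return self.rows

adapter = SqliteAdapter()
adapter.conn.execute("CREATE TABLE faculty (name TEXT)")
adapter.conn.execute("INSERT INTO faculty VALUES ('Dr. Kim')")
print(adapter.fetch_all("SELECT name FROM faculty"))  # → [('Dr. Kim',)]
```

The translator web service would then be written against `fetch_all()` alone, so supporting a new partner database means adding one adapter rather than touching the translation logic.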