System and method for developing and interpreting e-commerce metrics by utilizing a list of rules wherein each rule contain at least one of entity-specific criteria. Patent Number: 7,013,323, from the United States Patent and Trademark Office (PTO)


Title: System and method for developing and interpreting e-commerce metrics by utilizing a list of rules wherein each rule contain at least one of entity-specific criteria

Abstract: A system, method and computer program product for developing and interpreting e-commerce metrics is disclosed. The method involves collecting pages that are commonly transmitted over a computer network (e.g., the Internet, an institutional intranet, etc.), where the pages are relevant to the business operations of an entity, collecting external data, which may or may not be available on the computer network, but that is highly relevant to the entity, processing the collected pages with additional information such as contact information, routing tables, financial information, and other data which does not need to be collected more than once, and scoring the pages based on all the information collected to determine statistics. The statistics are analyzed for business information which may be important to the operations of the entity. The method then produces a report to deliver a continuous stream of e-commerce intelligence for the entity.

Patent Number: 7,013,323 Issued on 03/14/2006 to Thomas, et al.


Inventors: Thomas; Jason B. (Arlington, VA); Bildner; Mark J. (Alexandria, VA); Thomas; Brandy M. (Arlington, VA); Young; Christopher D. (Washington, DC); Moore; Richard P. (Potomac Falls, VA); Biro; Ross A. (Alexandria, VA); Pemberton; Alissa S. (Washington, DC); Perlman; Diane B. (Silver Spring, MD)
Assignee: Cyveillance, Inc. (Arlington, VA)
Appl. No.: 576896
Filed: May 23, 2000

Current U.S. Class: 709/203; 709/224; 709/219; 707/4; 707/5; 707/6; 707/7; 707/10
Current Intern'l Class: G06F 15/16    (20060101)
Field of Search: 709/218,223,224,203,217,219 707/1,3-7,10


References Cited [Referenced By]

U.S. Patent Documents
5659732         Aug., 1997    Kirsch.
5931907         Aug., 1999    Davies et al.
5933822         Aug., 1999    Braden-Harder et al.
5963965         Oct., 1999    Vogel.
6289341         Sep., 2001    Barney.
6321228         Nov., 2001    Crandall et al.
6377961         Apr., 2002    Ryu.
6442606         Aug., 2002    Subbaroyan et al.
6480835         Nov., 2002    Light.
6480837         Nov., 2002    Dutta.
6519586         Feb., 2003    Anick et al.
2001/0044795    Nov., 2001    Cohen et al.
2002/0147880    Oct., 2002    Wang Baldonado.
2002/0169694    Nov., 2002    Stone et al.
2003/0149684    Aug., 2003    Brown et al.

Primary Examiner: Najjar; Saleh
Assistant Examiner: Duong; Oanh
Attorney, Agent or Firm: DLA Piper Rudnick Gray Cary US LLP

Claims



What is claimed is:

1. A method for developing and interpreting e-commerce metrics of an entity, comprising the steps of:

(1) collecting pages that are commonly transmitted over a computer network;

(2) receiving a list of predetermined, entity-specific criteria defining information relevant to the entity;

(3) receiving a first set of rules related to entity-specific criteria defining information relevant to an entity;

(4) determining whether each of said pages satisfies each of said first set of rules therefore obtaining a first subset of said pages;

(5) parsing content of said first subset of said pages using a second set of rules inclusive of said first set and adding rules related to searching for at least one key word in at least one predetermined category of key words, thereby obtaining a second subset of said pages;

(6) scoring said second subset of said pages utilizing a third set of rules incorporating analyzed statistics based on said first and said second set of rules and incorporating additional information; and

(7) generating a report utilizing a fourth set of rules prioritizing results of said second and third set of rules, including said analyzed statistics and said additional information;

such that said report is utilized to aid an entity in doing business over said computer network.

2. The method of claim 1, wherein said computer network is the global Internet.

3. The method of claim 1, wherein said computer network is an intranet.

4. The method of claim 1, wherein said computer network is an extranet.

5. The method of claim 2, further comprising the steps of:

(8) obtaining contact information for said report.

6. The method of claim 2, further comprising the step of:

(8) generating said report listing scores.

7. The method of claim 1, wherein step 6 comprises the steps of:

(a) compiling statistics from said second subset of said pages;

(b) storing said statistics; and

(c) analyzing said statistics by combining said statistics, said second subset of said pages and said additional information.

8. The method of claim 7, further comprising the steps of:

performing step (a)-(c) for a plurality of entities.

9. A system for developing and interpreting e-commerce metrics of an entity, comprising:

a downloader for searching a computer network, wherein said computer network contains pages;

a page processing module for receiving said pages downloaded from said search of said computer network, said page processing module utilizing a first set of rules related to entity-specific criteria defining information relevant to an entity and forming a first subset of pages;

an archive for storing said first subset of said pages, said pages being downloaded to said archive by said page processing module; and

a database for allowing said page processing module to perform queries of said pages from said first subset of said pages, stored on said archive, in order to produce a report, said report comprising:

parsed content of said pages generated utilizing a second set of rules inclusive of said first set and adding rules related to searching for at least one key word, whereby the parsed content is parsed with at least one predetermined category;

scored pages generated utilizing a third set of rules incorporating analyzed statistics based on said first and said second set of rules and incorporating additional information; and

pages prioritized utilizing a fourth set of rules prioritizing contents of said report utilizing results of said second and said third set of rules including said analyzed statistics and said additional information;

such that said report is utilized to aid an entity in doing business over said computer network.

10. The system of claim 9, wherein said computer network is the global Internet.

11. The system of claim 9, wherein said computer network is an intranet.

12. The system of claim 9, wherein said computer network is an extranet.

13. The system of claim 10, further comprising:

a plurality of Web clients that provides a graphical user interface for a user to enter search criteria and communicate with said downloader, thereby controlling said page processing module.

14. A computer program product comprising a computer usable medium having computer readable program code means embodied in said medium for causing an application program to execute on a computer that develops and interprets e-commerce metrics of an entity, said computer readable program code means comprising:

first computer readable program code means for causing the computer to collect pages that are commonly transmitted over a computer network;

second computer readable program code means for causing the computer to receive a first set of rules related to entity specific criteria defining information relevant to the entity;

third computer readable program code means for causing the computer to determine whether each of said pages satisfies each of said first set of rules therefore obtaining a first subset of said pages,

fourth computer readable program code means for parsing content of said first subset of said pages using a second set of rules inclusive of the first set and adding rules related to searching for at least one key word in at least one predetermined category of key words, thereby obtaining a second subset of said pages;

fifth computer readable program code means for scoring said second subset of said pages utilizing a third set of rules incorporating analyzed statistics based on the first and the second set of rules and incorporating additional information; and

sixth computer readable program code means for causing the computer to generate a report utilizing a fourth set of rules;

contents of the report utilizing results of the second and third set of rules including the analyzed statistics and said additional information;

such that said report is utilized to aid an entity in doing business over said computer network.

15. The computer program product of claim 14, wherein said computer network is the global Internet.

16. The computer program product of claim 15, further comprising:

seventh computer readable program code means for causing the computer to obtain contact information for said report.

17. The computer program product of claim 15, further comprising:

seventh computer readable program code means for causing the computer to generate said report listing said scores of said subset of pages.

18. The computer program product of claim 17, wherein said fifth computer readable program code means comprises:

seventh computer readable program code means for causing the computer to compile statistics from said pages;

eighth computer readable program code means for causing the computer to store said statistics; and

ninth computer readable program code means for causing computer to analyze said statistics by combining said statistics, said pages and said additional information.

19. The computer program product of claim 18, further comprising tenth computer readable program code means for causing the computer to perform the seventh, eighth, and ninth computer readable program code for a plurality of entities.
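
As an illustration only, the four rule sets recited in claim 1 can be read as a staged pipeline. The sketch below is one possible reading, not the patented implementation: the function name `run_pipeline`, the representation of first-set rules as predicates, and the keyword categories and weights are all assumptions introduced here. Note that claim 1 requires a page to satisfy each rule of the first set, so the first stage uses `all`.

```python
def run_pipeline(pages, first_rules, keyword_categories, weights, top_n=10):
    """Illustrative reading of claim 1, steps (4) through (7)."""
    # Step (4): keep pages that satisfy each rule of the first set.
    first_subset = {
        url: text for url, text in pages.items()
        if all(rule(text) for rule in first_rules)
    }

    # Step (5): the second set includes the first set (already applied) and
    # adds a keyword search in at least one predetermined category.
    second_subset = {}
    for url, text in first_subset.items():
        lowered = text.lower()
        categories = sorted(
            cat for cat, words in keyword_categories.items()
            if any(word in lowered for word in words)
        )
        if categories:
            second_subset[url] = categories

    # Step (6): score the second subset (assumed: sum of category weights).
    scored = {
        url: sum(weights.get(cat, 1) for cat in cats)
        for url, cats in second_subset.items()
    }

    # Step (7): prioritize results for the report, highest score first.
    ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical data for a fictional entity called "Acme".
pages = {
    "http://a.example/": "Acme widgets for sale, cheap replica",
    "http://b.example/": "Acme company history",
    "http://c.example/": "Unrelated replica watches",
}
first_rules = [lambda text: "acme" in text.lower()]
keyword_categories = {"sales": ["for sale", "cheap"], "counterfeit": ["replica"]}
weights = {"sales": 1, "counterfeit": 3}

ranked = run_pipeline(pages, first_rules, keyword_categories, weights)
```

In this example only the first page survives both filtering stages; the second page mentions the entity but matches no keyword category, and the third never mentions the entity.
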
Description



CROSS-REFERENCE TO RELATED APPLICATION

This application is related to the following commonly owned, co-pending applications:
    • "System, Method and Computer Program Product for an Online Monitoring Search Engine", by Thomas, having application Ser. No. 09/133,374, filed on Aug. 13, 1998, which is incorporated herein by reference in its entirety; and
    • "System, Method and Computer Program Product for Analyzing E-Commerce Competition", by Thomas et al., having application Ser. No. 09/576,895, filed concurrently herewith, which is incorporated herein by reference in its entirety.


    BACKGROUND OF THE INVENTION

    1. Field of the Invention

    The invention relates generally to computer network search engines, and more particularly to search engines for performing online monitoring activities.

    2. Related Art

    Over the past several years, there has been a large growth in the number of computers, and thus people, connected to the global Internet and the World-Wide Web (WWW). This collective expansion allows computer users to access various types of information, disseminate information, and be exposed to electronic commerce (e-commerce) activities, all with a great degree of freedom. E-commerce includes large corporations, small businesses, individual entrepreneurs, organizations, and the like who offer their information, products, and/or services to people all over the world via the Internet.

    The rise in use of the Internet, however, also has a negative side. Given the Internet's vastness and freedom, many unscrupulous companies, organizations and individuals have taken the opportunity to profit by diverting customer traffic, misusing product information, and mis-associating their product or company with others. For example, it has been estimated that millions of pages employ tags and text designed to divert searchers to their sites when the Internet users actually searched for something else. These diversions and incidents of misinformation cause a loss of business. Also, an individual, company, organization, or the like may be concerned with other violations such as the illegal sale of their products, or the sale of inferior products using their brand names. Furthermore, an individual, a company, an organization, or the like may be concerned with false information (i.e., "rumors") that originate and spread quickly over the Internet, resulting in the disparagement of the entity. Such entities may also be interested in gathering data about how they and their products and/or services are perceived on the Internet (i.e., a form of market research).

    In order to compete with the above-described aspects of the Internet, entities are currently forced to search Internet resources (i.e., Web sites, File Transfer Protocol (FTP) sites, newsgroups, chat rooms, etc.) by visiting thousands of sites in order to discern activities relevant to their business operations. Such searching is currently done either by hand or using commercial search engines. Each of these methods is costly because a great amount of time is required to do such searching, time that detracts from positive, profit-earning activities. Adding to the frustration of discerning relevant activity is the fact that commercial search engines are updated infrequently and typically limit the resulting number of sites (i.e., "hits") that any given search request returns. Furthermore, the task of visiting each site to determine whether there is indeed relevant activity and, if so, the extent and character of it, also demands a great deal of time.

    Therefore, in view of the above, what is needed is a system, method and computer program product for developing and interpreting e-commerce metrics. Such e-commerce metrics can provide relevant market information and feedback to an entity so that it may detect and prioritize its online business efforts. Further, what is needed is a system, method and computer program product that searches the Internet's vast resources for data relevant to the entity's activities and its associates and produces a detailed, customized report of relevant activity affecting the entity.

    SUMMARY OF THE INVENTION

    The invention is directed to a system, method and computer program product for developing and interpreting e-commerce metrics that meets the identified needs. The method and computer program product involve collecting documents that are commonly transmitted over a computer network (e.g., the Internet, an institutional intranet, etc.), where the documents are relevant to the business operations of an entity. The method and computer program product also collect external data, which may or may not be available on the computer network, but that is highly relevant to the entity. A list of predetermined, entity-specific criteria is obtained from the external data. A list of rules is generated, where each rule contains at least one of the entity-specific criteria. The method and computer program product determine whether any of the collected pages satisfies any of the listed rules. Matching pages are gathered into a subset for further processing. Additional information is added to the subset of pages. The additional information can be contact information, routing tables, financial information, and other data which does not need to be collected more than once.
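
    The rule list and page selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Rule` class, the `select_pages` function, the choice of substring matching, and the sample data are all hypothetical. The summary says a page is kept when it satisfies any of the listed rules, so selection here uses an "any rule" test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A rule holding at least one entity-specific criterion (here, a phrase)."""
    name: str
    criteria: tuple

    def __post_init__(self):
        if not self.criteria:
            raise ValueError("each rule needs at least one criterion")

    def matches(self, text):
        lowered = text.lower()
        # Assumption: a rule is satisfied when all of its criteria appear.
        return all(c.lower() in lowered for c in self.criteria)


def select_pages(pages, rules, extra):
    """Keep pages satisfying any rule and attach the one-time extra data."""
    subset = {}
    for url, text in pages.items():
        hits = [rule.name for rule in rules if rule.matches(text)]
        if hits:
            subset[url] = {"text": text, "rules": hits, "extra": extra}
    return subset


# Hypothetical collected pages and criteria for a fictional entity "Acme".
pages = {
    "http://a.example/": "Buy genuine Acme widgets here",
    "http://b.example/": "Weather report for Tuesday",
}
rules = [Rule("brand", ("acme",)), Rule("brand+sale", ("acme", "buy"))]
subset = select_pages(pages, rules, {"contact": "legal@acme.example"})
```

    The additional information (here a single contact address) is attached once per matching page rather than collected again for each page, mirroring the note that such data does not need to be collected more than once.
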

    The method and computer program product score the pages based on all the information collected to determine statistics. The statistics are analyzed for business information which may be important to the operations of the entity. The method and computer program product then produce a report to deliver a continuous stream of e-commerce intelligence for the entity. Depending on the entity-specific criteria, the method and computer program product can determine and report whether others are diverting the entity's buyers or computer network traffic by using metatags and other browser magnets; selling or distributing the entity's goods without authorization; using or misusing the entity's intellectual property; claiming false affiliations with the entity; associating the entity with objectionable material, such as hate sites or other rogue sites, or with pornographic content; or engaging in other relevant activity affecting the entity or its goodwill. The method and computer program product can also be used to help identify potential partners, affiliates and other sources of unrealized revenue and to identify newsgroup commentary that may be impacting the entity's reputation and/or value.
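
    One way to picture the scoring and reporting step is the sketch below. The category names loosely follow the activities listed above, but the keywords, the weights, and the function names `score_page` and `build_report` are assumptions made for illustration; the source does not specify a scoring formula.

```python
from collections import Counter

# Hypothetical categories, keywords, and weights; none come from the patent.
CATEGORY_KEYWORDS = {
    "traffic_diversion": ["metatag", "keyword stuffing"],
    "unauthorized_sale": ["wholesale", "replica"],
    "objectionable": ["hate"],
}
CATEGORY_WEIGHTS = {
    "traffic_diversion": 2,
    "unauthorized_sale": 3,
    "objectionable": 5,
}


def score_page(text):
    """Return (score, matched categories) for one page."""
    lowered = text.lower()
    matched = [
        category for category, words in CATEGORY_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]
    return sum(CATEGORY_WEIGHTS[c] for c in matched), matched


def build_report(pages):
    """Score every page, tally categories, and list pages by descending score."""
    stats = Counter()
    rows = []
    for url, text in pages.items():
        score, categories = score_page(text)
        stats.update(categories)
        rows.append((score, url, categories))
    rows.sort(key=lambda row: (-row[0], row[1]))
    return {"category_counts": dict(stats), "pages": rows}


report = build_report({
    "http://x.example/": "Replica Acme watches, wholesale prices",
    "http://y.example/": "Acme fan page",
})
```

    The per-category counts stand in for the "statistics" mentioned above, and the sorted page list is the simplest form of a prioritized report.
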

    The e-commerce metrics system of the invention includes a downloader for searching a computer network (e.g., the Internet), a page processing module for receiving the pages downloaded from the search of the computer network, the page processing module forming a list of pages. In one embodiment, the system contains numerous downloaders for searching the entire computer network, searching specific locations, and searching specific formats (e.g., newsgroups or chat sites). The system also contains an archive for storing the listed pages, the pages being downloaded to the archive by the page processing module, and a database for allowing the page processing module to perform higher order operations on the pages on the list in order to produce a report to be utilized by users of the system. Entities use the system to search for information about themselves or other entities. In one embodiment, the system also includes a plurality of Internet clients (e.g., Web, e-mail, Wireless Application Protocol (WAP), etc.) that provide a graphical user interface (GUI) for users to enter search criteria, communicate with the downloader and page processing module, and view pages with scoring information, entity statistics, and page contents.
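
    The relationship among the downloader, page processing module, archive, and database can be sketched with in-memory stand-ins. All class and function names below are hypothetical, the downloader is a stub that performs no network access, and the relevance test is a placeholder for the rule matching described earlier.

```python
class Archive:
    """Stores complete copies of listed pages, keyed by URL."""

    def __init__(self):
        self._pages = {}

    def store(self, url, content):
        self._pages[url] = content

    def get(self, url):
        return self._pages[url]


class Database:
    """Stores per-page metadata that later queries and reports draw on."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def query(self, predicate):
        return [row for row in self.rows if predicate(row)]


class PageProcessingModule:
    """Receives downloaded pages, keeps relevant ones, and records metadata."""

    def __init__(self, archive, database, is_relevant):
        self.archive = archive
        self.database = database
        self.is_relevant = is_relevant

    def receive(self, url, content):
        if not self.is_relevant(content):
            return False
        self.archive.store(url, content)
        self.database.insert({"url": url, "length": len(content)})
        return True


def stub_downloader():
    """Stand-in for a real downloader; yields (url, content) pairs."""
    yield "http://a.example/", "Acme widgets on sale"
    yield "http://b.example/", "Nothing relevant"


archive, database = Archive(), Database()
module = PageProcessingModule(
    archive, database, lambda text: "acme" in text.lower()
)
for url, content in stub_downloader():
    module.receive(url, content)
```

    Keeping full page copies in the archive and only derived metadata in the database matches the division of labor described above, where the database supports higher order operations and the archive holds the pages themselves.
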

    One advantage of the invention is that users may quickly and efficiently search and find relevant information contained on Web, FTP, and File Service Protocol (FSP) sites, as well as chat rooms and newsgroups within the Internet.

    Another advantage of the invention is that detailed and customizable reports listing overall statistics and associated metrics are produced allowing entities to focus their business efforts.

    Another advantage of the invention is that its back-end (page processing module) and front-end (user interface) are designed to operate independently of each other, thus allowing greater throughput and availability of the system as a whole.

    Yet another advantage of the invention is that lists of relevant pages may be grouped and prioritized, both in an automated and manual fashion, in order to arrive at a manageable set of data.

    Further features and advantages of the invention as well as the structure and operation of various embodiments of the invention are described in detail below with reference to the accompanying drawings.

    BRIEF DESCRIPTION OF THE FIGURES

    The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art(s) to make and use the invention.

    In the drawings:

    FIG. 1A is a block diagram illustrating the system architecture of an embodiment of the invention, showing network connectivity among the various components;

    FIG. 1B is a block diagram illustrating the global Internet, showing the different components which may be present;

    FIG. 2 is a block diagram illustrating the software architecture of an embodiment of the invention, showing communications among the various components;

    FIG. 3 is a flowchart showing the overall operation of an embodiment of the invention;

    FIG. 4 is a block diagram illustrating the software architecture of a page processing module according to an embodiment of the invention;

    FIG. 5 is a flowchart showing the operation of scoring pages, according to an embodiment of the invention;

    FIGS. 6, 7, 8A and 8B are exemplary scoring input pages according to an embodiment of the invention;

    FIGS. 9 and 10 are exemplary output report pages according to an embodiment of the invention; and

    FIG. 11 is a block diagram of an exemplary computer system useful for implementing the invention.

    The invention will now be described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.

    DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

    Table of Contents

  • I. Overview
  • II. System Architecture
  • III. Software Architecture
  • IV. Overall E-Commerce Metrics System Operation
  • V. Graphical User Interface (Front-End)
  • VI. Page Processing Module (Back-End)
  • VII. Output Reports
  • VIII. Front-End and Back-End Severability
  • IX. Environment
  • X. Conclusion
    I. Overview


    The present invention is directed to a system, method, and computer program product for developing and interpreting e-commerce metrics. In one embodiment of the invention, users are entities who are interested in maximizing their return on investment and e-commerce objectives with a continuous stream of relevant market feedback from the Internet. Such entities can employ an intelligent search engine that spans the entirety of the Internet's vast resources and returns links to Internet sites that, with a high probability of certainty, contain relevant information affecting the entity. The input of the system's search engine can be customized for each entity based on, for example, their products, services, business activity, and/or the types of intellectual property owned. The system's search engine can also provide detailed reports, customized to fit each entity's monitoring needs, so that the entity's personnel may prioritize their activities. In one embodiment, the system also provides a Web server so that entities may remotely utilize the search engine.

    While the invention is described in terms of the above example, this is for convenience only and is not intended to limit its application. In fact, after reading the following description, it will be apparent to one skilled in the relevant art(s) how to implement the following invention in alternative embodiments (e.g., providing online monitoring for a corporate intranet or extranet).

    Furthermore, while the following description focuses on the monitoring of Web sites, newsgroups, and FTP sites, and thus employs such terms as Universal Resource Locators (URLs), address, Web pages, and content, it is not intended to limit the application of the invention. It will be apparent to one skilled in the relevant art(s) based on the teachings contained herein how to implement the following invention, where appropriate, in alternative embodiments. For example, the invention may be applied to monitoring chat rooms, forums, or mailing lists, etc.

    II. System Architecture

    FIG. 1A is a block diagram illustrating the physical architecture of an e-commerce metrics system 100, according to an embodiment of the invention, showing the network connectivity among the various components. It should be understood that the particular e-commerce metrics system 100 in FIG. 1A is shown for illustrative purposes only and does not limit the invention. As will be apparent to one skilled in the relevant art(s) based at least on the teachings described herein, all of the components "inside" (not shown) of the e-commerce metrics system 100 are connected directly or via computer network 103.

    The e-commerce metrics system 100 includes a Web downloader 108 and a news downloader 109. These downloaders are configured according to the nature of the pages that they search. The system includes a page processing module 110 that serves as the "back-end" of the invention. Page processing module 110 connects to the downloaders 108 and 109 to receive downloaded pages. Connected to the page processing module 110 are a database 120 and an archive 115. Page processing module 110 performs various counting and scoring operations on the downloaded pages and forwards the resulting metadata to database 120. Metadata includes various high order results from processing the data contained on collected pages. Examples include the total number of pages containing links to a certain Web site, and/or the average number of external links per Web page on a Web site. Complete copies of the pages are stored on archive 115.
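
    The two example metrics can be computed along the following lines. This is a sketch using Python's standard `html.parser` and `urllib.parse` modules; the helper names `LinkCollector`, `external_links`, and `link_metadata` are hypothetical, and the sketch assumes that the supplied pages all belong to the Web site being measured and that "external" means a link to a different host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href targets of <a> tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def external_links(page_url, html):
    """Return absolute links on the page that point to a different host."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(page_url).netloc
    absolute = (urljoin(page_url, href) for href in parser.links)
    return [u for u in absolute if urlparse(u).netloc not in ("", host)]


def link_metadata(pages, target_host):
    """Compute both example metrics over a dict of {url: html}."""
    per_page = {url: external_links(url, html) for url, html in pages.items()}
    # Metric 1: number of pages containing a link to the target Web site.
    linking = sum(
        any(urlparse(u).netloc == target_host for u in links)
        for links in per_page.values()
    )
    # Metric 2: average number of external links per page.
    total = sum(len(links) for links in per_page.values())
    average = total / len(per_page) if per_page else 0.0
    return {"pages_linking_to_target": linking, "avg_external_links": average}


pages = {
    "http://shop.example/a":
        '<a href="http://acme.example/">Acme</a> <a href="/b">next</a>',
    "http://shop.example/b":
        '<a href="http://acme.example/x">x</a> <a href="http://other.example/">o</a>',
}
meta = link_metadata(pages, "acme.example")
```

    Only the resulting dictionary would be forwarded to the database in this picture; the raw HTML would go to the archive.
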

    In one embodiment of the invention, directed page processing module 150 gathers pages from specific locations on computer network 103. Thus, directed page processing module 150 contains control logic similar to downloaders 108, 109 and page processing module 110, but only as necessary for limited (specific) page retrieval. Directed page processing module 150 forwards these pages to archive 115 after processing the information on the downloaded pages. Similarly, metadata generated from the downloaded pages is sent to database 120.

    Client/analyst Web server 125 provides clients 140 and analysts 130 with access to the metadata stored in database 120 and the pages stored in archive 115. Analysts 130 are users of the invention who can review the metadata and pages and alter the focus of the searches conducted by the downloaders 108, 109, and directed page processing module 150. This feedback measure allows the invention to fully cover areas of the computer network 103 which contain desired information. Client Web server 135 is connected to archive 115. Client Web server 135 provides clients 140 with access to the stored pages used to develop metadata, which forms the basis for the conclusions that the invention reaches through its scoring processes.

    As is well-known in the relevant art(s), a Web server is a server process running at a Web site which sends out Web pages in response to Hypertext Transfer Protocol (HTTP) requests from remote browsers. The Web servers 125 and 135 serve as "front ends" of the invention. That is, the Web servers 125 and 135 provide the graphical user interface (GUI) to users of the e-commerce metrics system 100 in the form of Web pages. Such users may access Web servers 125 and 135 either directly or via a connection to computer network 103 (e.g., the Internet).

    While only one database 120, archive 115, page processing module 110, and directed page processing module 150 are shown in FIG. 1A, it will be apparent to one skilled in the relevant art(s) that e-commerce metrics system 100 may be run in a distributed fashion over a plurality of the above-mentioned network elements connected via computer network 103. For example, both the page processing module 110 "back-end" application and the Web servers 125 and 135 "front-end" may be distributed over several computers, thereby increasing the overall execution speed and/or reliability of the e-commerce metrics system 100. More detailed descriptions of the e-commerce metrics system 100 components, as well as their functionality, are provided below.

    Referring to FIG. 1B, the global Internet, depicted as computer network 103, is shown. It includes a plurality of FTP sites 104 (shown as sites 104a-n) and the WWW. Within the WWW are a plurality of Web sites 106 (shown as sites 106a-n). The search space for the page processing module 110 includes the Web sites 106 and the plurality of FTP sites 104. Within the Usenet are a plurality of newsgroups 105. As mentioned above, it will be apparent to one skilled in the relevant art(s) that the search space (i.e., computer network 103) of the e-commerce metrics system 100, although not shown, will also include chat rooms, mailing lists, FSP sites, etc.

    As will be apparent to one skilled in the relevant art(s), audio-visual content can be parsed for analysis by using technologies such as optical character recognition (OCR) and/or watermark technologies.

    III. Software Architecture

    Referring to FIG. 2, a block diagram illustrating a software architecture 200 according to an embodiment of e-commerce metrics system 100, showing communications among the various components, is shown. The software architecture 200 of e-commerce metrics system 100 includes software code that implements the page processing module 110 and directed page processing module 150 (hereinafter "processing modules 201") in a high level programming language such as the C++ programming language. Further, in an embodiment, the processing modules 201 software code is an application running on an IBM™ (or compatible) personal computer (PC) in the Windows NT™ operating system environment.

    In one embodiment of the invention, the database 120 is implemented using a high-end relational database product (e.g., Microsoft™ SQL Server, IBM™ DB2, ORACLE™, INGRES™, etc.). As is well-known in the relevant art(s), relational databases allow the definition of data structures, storage and retrieval operations, and integrity constraints, where data and relations between them are organized in tables.

    In one embodiment of the invention, the processing modules 201 application communicates with the database 120 using the Open Database Connectivity (ODBC) interface. As is well-known in the relevant art(s), ODBC is a standard for accessing different database systems from a high level programming language application. It enables these applications to submit statements to ODBC using an ODBC structured query language (SQL) and then translates these to the particular SQL commands the underlying database product employs.

    The archive 115, in one embodiment of the invention, is any physical memory device that includes a storage medium and a cache (e.g., the hard drive and primary cache, respectively, of the same PC that runs the page processing module 110 application). In an alternative embodiment, the archive 115 may be a memory device external to the PC hosting the processing modules 201 application. In yet another alternative embodiment, the archive 115 may encompass a storage medium physically separate from the cache, where the storage medium may also be distributed over several elements connected to the computer network. Further, in one embodiment of the invention, the archive 115 communicates with the processing modules 201 application and Web servers 125 and 135 using the operating system's native file commands (e.g., Windows NT™).

    The Web servers 125 and 135 provide the GUI "front-end" for e-commerce metrics system 100. In one embodiment of the invention, they are implemented using the Active Server Pages (ASP), Visual BASIC (VB) script, Extensible Mark-up Language (XML), and JavaScript™ server-side scripting environments that allow the creation of dynamic Web pages. The Web servers 125 and 135 communicate with the plurality of clients 140 and analysts 130 (hereinafter, collectively shown as "users 202") using HTTP. The users 202 employ a browser (or other GUI) using Java, JavaScript™, and Dynamic Hypertext Markup Language (DHTML). In one embodiment, users can connect to e-commerce metrics system 100 via a WAP phone or facsimile machine. In an embodiment of the invention, as will be described in detail below in Section VIII, users 202 may also communicate directly with the processing modules 201 application via HTTP.

    IV. E-Commerce Metrics System

    Referring to FIG. 3, a flowchart 300 showing the overall operation of the e-commerce metrics system 100, according to an embodiment of the invention, is shown. Flowchart 300 begins at step 302 with control passing immediately to both steps 304 and 310. Step 304 takes place in the directed page processing module 150. In step 304, a user defines search criteria. The search criteria, as explained in detail below in Section V, are customized according to a particular user's concerns. In step 306, a search of the computer network 103 is performed. This search returns a list of probable uniform resource locators (URLs). As is well-known in the relevant art(s), a URL is the standard for specifying the location of an object on the computer network 103. The URL standard addressing scheme is specified as "protocol://hostname" (e.g., "http://www.a_company.com", "ftp://organization/pub/files" or "news:alt.topic"). A URL beginning with "http" specifies a Web site 106, a URL beginning with "ftp" specifies an FTP site 104, and a URL beginning with "nntp" specifies a newsgroup. The probable URLs indicate a first (preliminary) set of locations (i.e., addresses) on the computer network 103, based on the search criteria, where pages containing information relevant to an entity's operations may be found. The search in step 306 is described in detail below in Section V.
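
    As an illustration of the addressing scheme described above, the following minimal sketch sorts URLs by their protocol prefix. It is not part of the described system: the function name classify_url and the mapping table are hypothetical, and the "news" prefix is included only because the example URL above uses it.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URL scheme to the kind of location it names.
# "news" is included because the example "news:alt.topic" uses that prefix.
SCHEME_TO_KIND = {
    "http": "web site",
    "ftp": "ftp site",
    "nntp": "newsgroup",
    "news": "newsgroup",
}


def classify_url(url: str) -> str:
    """Return the kind of network location a URL points to, by its scheme."""
    scheme = urlsplit(url).scheme.lower()
    return SCHEME_TO_KIND.get(scheme, "unknown")
```

    A crawler could use such a classification to decide whether a probable URL is handled by ordinary page download or by the FTP-specific procedure described later in this section.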

    A separate process is also initiated from step 302. From step 302 control also immediately passes to step 310 in downloaders 108 and 109. The page searching and retrieval process is substantially similar to that of steps 306-308. Step 310, however, does not work from a predetermined list of locations or addresses on computer network 103. Downloaders 108 and 109 download everything available on computer network 103. In step 312, the retrieved pages are filtered for information that is minimally relevant for users 202. Minimally relevant pages are downloaded to page processing module 110 in step 314.

    In steps 308 and 314, each of the URLs is visited and the contents downloaded locally to processing modules 201. The aim of the download steps 308 and 314 is to allow subsequent processing steps of the e-commerce metrics system 100 to be performed on preserved copies of the visited URLs. This eliminates the need for re-visiting (and thus, re-establishing a connection to) each of the Web sites 106, FTP sites 104, etc. specified by the URLs, thus increasing the overall performance of the e-commerce metrics system 100.

    If any of the URLs within the preliminary set contains files, those files may contain potentially relevant material (e.g., a "*.mp3" music file, or a "*.gif" or "*.jpg" image file). This is in contrast to actual text located on a Web page of a particular Web site 106. The files may be located: (1) on a different Web site 106 accessible via a hyperlink on the Web page the e-commerce metrics system 100 is currently accessing; (2) on a different Web page of the same Web site 106 the e-commerce metrics system 100 is currently accessing; or (3) in a different directory of the FTP site 104 than the e-commerce metrics system 100 is currently accessing. In these instances, the e-commerce metrics system 100 employs a Web crawling technique in order to locate the files.

    The Web crawling technique of the present invention discussed herein includes the use of URL address variations. After the original URL is visited and the link to the file is identified, the e-commerce metrics system 100 truncates the link URL at the rightmost slash ("/"), thus generating a new link URL. This process is repeated until a reachable domain is generated. This technique takes advantage of the fact that most designers of Web sites 106 allow "default" documents to be returned by their Web servers in response to such URL (via HTTP) requests. An example of the Web crawling technique of the directed page processing module 150 and downloaders 108 and 109 is shown in Table 1 below.

    TABLE 1
    EXAMPLE OF WEB CRAWLING TECHNIQUE
    Original Web Page URL:
    http://www.links-to-interesting-files-all-over-the-net.com
    Interesting Links Found on the Original Web Page Identified by Search Criteria:
    http://www.really-good-music-not-yet-released.com/future-hit.mp3
    ftp://www.company-trades-secrets.com/july/tradeseceret.doc
    Truncated URLs:
    http://www.really-good-music-not-yet-released.com/
    ftp://www.company-trades-secrets.com/july/
    ftp://www.company-trades-secrets.com/
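
    The truncation step illustrated in Table 1 can be sketched as follows. This is a minimal sketch, assuming that truncation applies only to the path portion of the link URL; the function name truncated_urls is hypothetical. It only generates the candidate URLs and does not test whether each one is reachable, which the described system would do with an actual request.

```python
from urllib.parse import urlsplit, urlunsplit


def truncated_urls(link_url: str) -> list[str]:
    """Generate URL variations by repeatedly cutting at the rightmost slash.

    Candidates are returned from the longest path to the bare host.
    Reachability is not checked here; a caller would try each in turn.
    """
    parts = urlsplit(link_url)
    path = parts.path
    candidates = []
    while path not in ("", "/"):
        # Ignore a trailing slash, then keep everything up to and
        # including the rightmost remaining slash.
        path = path.rstrip("/")
        path = path[: path.rfind("/") + 1]
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates
```

    Applied to the two links in Table 1, this produces the same three truncated URLs listed there.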


    For any Web site 106 where the site's server is not currently responding (i.e., "down" or "off-line"), the directed page processing module 150 and downloaders 108 and 109 applications, before removing the URL corresponding to the site from the preliminary set, implement a "re-try" timer and mechanism.

    When any of the URLs within the preliminary set is an FTP site 104 (or FSP site), the normal steps of visiting and downloading the sites are not practical and thus, not used. Therefore, the invention contemplates a method for "FTP crawling" in order to accomplish steps 308 and 314 for such URLs.

    First, the directed page processing module 150 and downloaders 108 and 109 applications attempt to log into the FTP site 104 specified by the URL. As is well known in the relevant art(s), there are two types of FTP sites 104: password protected sites and anonymous sites. If the site 104 is password protected and the password is not published in a reference linked page, it is passed over and the URL is removed from the preliminary set. If the FTP site 104 has a published password, the applications attempt to log in using that password. If the FTP site 104 is an anonymous site, the applications attempt to log in. As is well known in the relevant art(s), an anonymous FTP site allows a user to log in using a user name such as "ftp" or "anonymous" and then use their electronic mail address as the password.

    If a connection can be established, the applications have access to the directory hierarchy containing the publicly accessible files (e.g., a "pub" subdirectory). The applications may then "nicely" crawl the relevant portions of the FTP site 104 by mapping the directory structure and then visiting certain directories based on keywords derived from the defined search criteria (steps 306 and 310).

    The purpose of nice FTP crawling is to capture the relevant contents of the FTP site 104 as it relates to the entity without burdening the host's resources by crawling the entire FTP site 104. This is especially important due to the large size of a typical FTP site 104 (e.g., a university's site or someone's entire PC hard disk drive), and due to the lack of crawl restriction standards like the "robots.txt" file commonly found on Web sites 106.

    Consider the example where the directed page processing module 150 and downloaders 108 and 109 are searching for the directory "ftp://ftp.stuff.com/~user/music/famous_artist" in the context of a search for information related to an entity's music product. First, the nice FTP crawling technique involves establishing a single connection to the FTP site 104 (even if multiple items of content are needed from the site) and then going to the root directory. Second, a counter is set to zero and a directory listing and snapshot of the current directory is taken. For each directory, if the directory name is "interesting," then the directed page processing module 150 and downloaders 108 and 109 enter the directory, set the counter to a positive number (e.g., C=2), then repeat the listing and snapshot step. If the counter is greater than zero or the directory is on the way to the destination directory, then the directory is entered and the listing and snapshot step is repeated.

    To simulate human behavior, it is best if the directed page processing module 150 and downloaders 108 and 109 perform a depth-first search and introduce slight pauses between directory listings. "Interesting" directory listings are those containing terms related to the search criteria. For example, keywords for this search may include "songs," "sound," "album," "artist," "mp3," music_type, famous_artist, etc., as well as the destination directory (in the example, "/famous_artist") and other hard-coded directories that are usually of interest (e.g., "/incoming").

    In an alternative embodiment, user 202 could also specify that uninteresting directories be crawled as well. The purpose of the counter (C) is to set the number of levels (depth) of sub-directories that the directed page processing module 150, as well as downloaders 108 and 109, will crawl in order to find "interesting" files. In one embodiment of the invention, to ease the burden on FTP site 104 servers, the total number of directories that can be crawled in a single FTP session may be limited.

    An example of the nice FTP crawling technique of the directed page processing module 150 and downloaders 108 and 109 is presented in Table 2 below. Table 2 illustrates a depth-first (from top to bottom) traversal of the directory structure of an FTP site 104.

    TABLE 2
    EXAMPLE OF NICE FTP CRAWLING TECHNIQUE
    ftp://ftp.stuff.com/
    ftp://ftp.stuff.com/~user
    ftp://ftp.stuff.com/~user/homework
    C ftp://ftp.stuff.com/~user/music
    C- ftp://ftp.stuff.com/~user/music/famous_artist1
    ...
    *C- ftp://ftp.stuff.com/~user/music/famous_artist
    ...
    C- ftp://ftp.stuff.com/~user/music/famous_artist2
    ...
    C- ftp://ftp.stuff.com/~user/music/famous_artist3
    ...
    ftp://ftp.stuff.com/~user/poetry
    ftp://ftp.stuff.com/~user2
    ftp://ftp.stuff.com/~user3
    C ftp://ftp.stuff.com/incoming
    ...
    C = directory judged to be "interesting" in context of the search and counter set to C
    C- = counter decremented at this level of the directory tree
    * = destination directory
    ... = the page processing module 110 crawls every subdirectory up to the depth of C under the directory
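
    The counter-based traversal of Table 2 can be approximated by the sketch below. It is only an illustration under stated assumptions: the FTP site is replaced by an in-memory nested dictionary (directory name mapped to its subdirectories), the function name nice_crawl and its parameters are hypothetical, "interesting" is reduced to a substring match on the directory name, and the pauses between listings are omitted. The max_dirs parameter reflects the optional per-session limit on crawled directories.

```python
def nice_crawl(tree, destination, keywords, depth=2, max_dirs=100):
    """Depth-first crawl of a directory tree using the counter rule.

    tree        -- nested dict: directory name -> dict of subdirectories
    destination -- path of the destination directory, e.g. "/~user/music/x"
    keywords    -- substrings that make a directory name "interesting"
    depth       -- value C the counter is set to in an interesting directory
    max_dirs    -- cap on directories listed in one session
    """
    dest_parts = [p for p in destination.strip("/").split("/") if p]
    visited = []

    def is_interesting(name):
        lowered = name.lower()
        return any(k in lowered for k in keywords)

    def walk(node, path_parts, counter):
        if len(visited) >= max_dirs:
            return
        visited.append("/" + "/".join(path_parts))  # "listing and snapshot"
        for name, child in node.items():
            child_parts = path_parts + [name]
            on_the_way = dest_parts[: len(child_parts)] == child_parts
            if is_interesting(name):
                walk(child, child_parts, depth)  # counter set to C
            elif counter > 0 or on_the_way:
                walk(child, child_parts, max(counter - 1, 0))  # counter decremented

    walk(tree, [], 0)
    return visited
```

    With a tree shaped like Table 2 and the keywords "music" and "incoming", the crawl passes through "~user" because it lies on the way to the destination, explores two levels below "music", and skips "homework", "poetry", and the other users' directories.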

    The above-described "nice FTP crawling" allows users 202 to obtain reports with both the URL and contents of any interesting FTP site 104.

    For any FTP site 104 where the password failed, it is passed over and the URL is removed from the preliminary set. If the site's server is not currently responding (i.e., "down" or "off-line"), has too many users already logged in, or is otherwise unavailable for connection, the directed page processing module 150 and downloaders 108 and 109 applications, before removing the URL corresponding to those sites from the preliminary set, implement a "re-try" timer and mechanism.
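
    A "re-try" timer and mechanism of the kind mentioned above might look like the following sketch. The text does not specify the number of attempts or the delay, so attempts and delay_seconds are assumed parameters, and fetch stands for any hypothetical download function that raises ConnectionError when a site is unavailable.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, delay_seconds=0.0):
    """Call fetch(url), retrying after a delay when the site is unavailable.

    Returns the fetched content, or None once every attempt has failed,
    in which case the caller would drop the URL from the preliminary set.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except ConnectionError:
            if attempt < attempts - 1:
                time.sleep(delay_seconds)  # the "re-try" timer
    return None
```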

    In step 316, the locally downloaded pages are scored (i.e., ranked). The scoring of the individual pages is based on the inputs specified in the search criteria (step 304). Each page is given a score based on a text search of keywords from the search criteria and statistics accumulated from analyzing the pages. The applications of processing modules 201 possess inference code logic that allows anything resident on a page or in the underlying HTML code (i.e., tags) that formats the page to be numerically weighted. The scoring may be based on the separate regions of the page such as the title or information within a tag (e.g., meta-tags, anchor tags, etc.). Also, scoring may be based on such information as the URL of the page itself, dimensions of pictures on the page, the presence of a specific picture file, the number of a certain type of file, length of sound files, watermarks, embedded source information, as well as information about a page provided by another page. During this process, the e-commerce metrics system 100 possesses logic to also recognize exact duplicates of an entity's graphics files (i.e., pictures, logos, etc.), without the need for digital watermarking. This additional logic further contributes to the scoring process of step 316. The numbers, figures, and statistics generated by the scoring process are collectively referred to as metadata. Metadata is stored in database 120 in step 318.
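
    The idea of weighting keyword matches by page region can be illustrated with a small sketch. The weights, region names, and the function score_page are all hypothetical; the text does not disclose actual weights, and a real scorer would also draw on the other signals listed above (picture dimensions, file counts, and so on).

```python
# Hypothetical per-region weights; the source does not give actual values.
REGION_WEIGHTS = {"title": 5.0, "meta": 3.0, "anchor": 2.0, "body": 1.0}


def score_page(regions, keywords):
    """Sum weighted keyword occurrences over the regions of one page.

    regions  -- dict mapping a region name to the text found in that region
    keywords -- search-criteria keywords (matched case-insensitively)
    """
    score = 0.0
    for region, text in regions.items():
        weight = REGION_WEIGHTS.get(region, 1.0)
        lowered = text.lower()
        for keyword in keywords:
            score += weight * lowered.count(keyword.lower())
    return score
```

    Under these assumed weights, a keyword that appears once in the title contributes as much as five occurrences in the body text.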

    The scoring of pages may also involve whether any offending URLs contain advertising. This is useful information to clients because those sites are considered commercial and not fan or personal (i.e., non-commercial) sites. Advertisement recognition is accomplished by parsing an image located within a URL and capturing the alt text, click-through URL, click-through resolved URL, and URL of the image. (Alt text is an HTML attribute that displays a block of text as an alternative to an image, for text-based browsers. It is used inside the <IMG> tag; the format is <IMG SRC="URL" ALT="TEXT">.) Then, if any of the following three rules is met, the e-commerce metrics system 100 identifies the probable presence of an advertisement: (1) the alt text or URL of the advertisement image contains keywords common to those around known advertisements; (2) the click-through URL and the resolved click-through URL specify different domains; or (3) the image is an exact match of a known advertisement.
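
    The three rules can be expressed compactly as in the sketch below. Several details are assumptions made for illustration: the keyword list AD_KEYWORDS is invented, "different domains" is approximated by comparing host names, and an "exact match" is approximated by comparing SHA-256 digests of the image bytes against a set of digests of known advertisements.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical keywords; a real list would be learned from known advertisements.
AD_KEYWORDS = ("advert", "banner", "sponsor", "click here")


def is_probable_ad(alt_text, image_url, click_url, resolved_url,
                   image_bytes, known_ad_hashes):
    """Return True if any of the three advertisement rules is met."""
    text = f"{alt_text} {image_url}".lower()
    # Rule 1: alt text or image URL contains an advertisement keyword.
    rule1 = any(keyword in text for keyword in AD_KEYWORDS)
    # Rule 2: click-through URL resolves to a different host.
    rule2 = urlsplit(click_url).hostname != urlsplit(resolved_url).hostname
    # Rule 3: image is byte-for-byte identical to a known advertisement.
    rule3 = hashlib.sha256(image_bytes).hexdigest() in known_ad_hashes
    return rule1 or rule2 or rule3
```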

    During this process, the e-commerce metrics system 100 develops a table of advertisement dimensions that are common to each Web site 106 encountered. Thus, in an alternative embodiment, a fourth rule is used to recognize advertisements. That is, if the dimensions of the image fit the tolerances of the dimensions in the table for a Web site 106, the image is probably an advertisement. The data for the table of advertisement dimensions are kept in archive 115 and queried via the database 120. Accordingly, the score for each page is adjusted (i.e., increased) if the e-commerce metrics system 100 identifies the presence of a probable advertisement.
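
    The fourth rule amounts to a tolerance check against the per-site table of advertisement dimensions. In the sketch below the function name, the pixel tolerance, and the table representation (a list of width and height pairs) are assumptions; the text does not state how tolerances are defined.

```python
def matches_ad_dimensions(width, height, known_sizes, tolerance=2):
    """Return True if (width, height) is within tolerance of a known ad size.

    known_sizes -- iterable of (width, height) pairs recorded for one Web site
    tolerance   -- allowed deviation in pixels on each axis (assumed value)
    """
    return any(
        abs(width - known_w) <= tolerance and abs(height - known_h) <= tolerance
        for known_w, known_h in known_sizes
    )
```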

    In step 320, the pages are archived to the storage media of archive 115. In order to archive each Web page, the "inline" contents of the page must be separated from the non-inline contents. Inline contents include any text, sounds, and images that are found directly on the Web page and that automatically play or are displayed when the page is browsed. In contrast, non-inline contents include the links that Web pages contain to other Web sites 106. In order to obtain a "self-sustaining" local copy of the Web page, only the inline contents of each Web page of the preliminary list of URLs are stored in archive 115. In an alternative embodiment, a client may want included in their final report (step 330 described below) properties or metrics associated with non-inline contents of relevant pages. Thus, in such an embodiment, step 320 can also include the non-inline contents of each Web page (i.e., a "complete" archive). In yet another embodiment, the system 100 in step 320 could generate a snapshot of the page and store this snapshot as a single graphical image.
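
    Separating inline from non-inline contents can be sketched with a simple HTML parse. This is an illustrative approximation only: it treats the sources of image, embed, and bgsound tags as inline and anchor targets as non-inline, and the class and function names are hypothetical. Text content, which is also inline, is not collected here.

```python
from html.parser import HTMLParser


class ContentSplitter(HTMLParser):
    """Collect inline resource URLs and non-inline link targets of a page."""

    # Tags whose named attribute points at content shown or played in place.
    INLINE_TAGS = {"img": "src", "embed": "src", "bgsound": "src"}

    def __init__(self):
        super().__init__()
        self.inline = []
        self.non_inline = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in self.INLINE_TAGS:
            source = attributes.get(self.INLINE_TAGS[tag])
            if source:
                self.inline.append(source)
        elif tag == "a" and attributes.get("href"):
            self.non_inline.append(attributes["href"])


def split_contents(html):
    """Return (inline_urls, non_inline_urls) found in an HTML string."""
    parser = ContentSplitter()
    parser.feed(html)
    return parser.inline, parser.non_inline
```

    An archiver that wants a self-sustaining copy would fetch and store the first list alongside the page, and record the second list only when a complete archive is requested.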

    As indicated in FIG. 3, step 320 is optional. That is, a user may desire not to perform a complete archive (and thus, not create self-sustaining local copies of the Web pages). Thus, the operation of e-commerce metrics system 100 may proceed directly to step 322 after the pages are scored in step 316. In an alternative embodiment, step 320 may perform a summary archive where, for example, only the headers and/or titles of the pages are archived.

    In step 322, the preliminary set of URLs is grouped into "actual sites." Most people equate Web sites 106 with either domain names or host names. For example, a URL of "http://www.a_company.com" and all the pages under it are typically viewed as one Web site 106. However, as Web designers develop schemes to partition their sites among distinct users, they divide their name space to create sub-sites. Examples are "community sites," which are companies or organizations that provide free homepages to individual consumers, and university servers that host student homepages. In these examples, each user or student with a homepage is an "actual site." For example, the directed page processing module 150 application may obtain a preliminary list (from step 306) of probable URLs containing the URLs shown in Table 3 below.

    TABLE 3
    PRELIMINARY LIST OF URLs
    http://www.university_with_many_students.edu/students/b/joe_smith/main.html
    http://www.university_with_many_students.edu/students/b/joe_smith/pics/me.jpg
    http://www.university_with_many_students.edu/students/c/jane_hacker/main.html
    .
    .
    .


    In the example of Table 3, the first two URLs are one actual site, whereas the third is a separate actual site. In one embodiment of the invention, the directed page processing module 150 application may recognize which URLs to group into one actual site based both on: (1) patterns such as ~username, /students/?/<?>, /users/?/<?>, /homepages/?/<?>, where "?" is a single character wildcard and "<?>" is an optional single character wildcard; and (2) hard-coded rules for known sites which follow no discernible patterns (e.g., the GeoCities™ community site). The grouping step aids in arriving at a manageable but informative number of URLs that will be included in a user's final report. In one embodiment of the invention, the above-described grouping technique may be used, in conjunction with the score pages step 316, to present the user with the "best" (i.e., highest scoring) page within an actual site. This removes information clutter from the final report and further aids in arriving at a manageable number of URLs to report.
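
    Pattern-based grouping into actual sites can be sketched with regular expressions. The exact wildcard syntax quoted above is not reproduced here; instead, the patterns below are an assumed reading based on the Table 3 URLs (a single-character directory followed by a user directory), and the names SITE_PATTERNS, actual_site, and group_by_actual_site are hypothetical. Hard-coded rules for sites with no pattern are omitted.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed path prefixes that identify one user's sub-site on a shared host.
SITE_PATTERNS = [
    re.compile(r"^/~[^/]+"),
    re.compile(r"^/(?:students|users|homepages)/[^/]/[^/]+"),
]


def actual_site(url):
    """Return the root URL of the actual site a page belongs to."""
    parts = urlsplit(url)
    for pattern in SITE_PATTERNS:
        match = pattern.match(parts.path)
        if match:
            return f"{parts.scheme}://{parts.netloc}{match.group(0)}/"
    # No sub-site pattern matched: fall back to the host itself.
    return f"{parts.scheme}://{parts.netloc}/"


def group_by_actual_site(urls):
    """Group URLs by the actual site each one belongs to."""
    groups = defaultdict(list)
    for url in urls:
        groups[actual_site(url)].append(url)
    return dict(groups)
```

    For the three URLs of Table 3, this yields two groups, one for joe_smith with two pages and one for jane_hacker with one page, matching the grouping described above.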

    In step 322, the e-commerce metrics system 100 groups pages into preliminary set(s) of URLs to be selected by users 202 in step 324. This optional human intervention step allows a second (refined and smaller) set of probable URLs to be defined, where likely infringements or disparagements of the entity's intellectual property occur. The selection step 324 is essentially a feedback option for expanding on the preliminary list of URLs. This refinement allows for more selectivity than what is produced from the search criteria (step 304) or general filtering (step 312).

    The e-commerce metrics system 100 automates the information gathering process in order to minimize the time required by human users and maximize their effectiveness. It is advisable, however, to have humans review and prioritize the set of probable URLs because no presently existing software has the ability to discern the intent of the use of content on a Web page. For example, the e-commerce metrics system may identify a page with an image of a famous professional athlete. The e-commerce metrics system, however, may not be able to identify whether the image is one where the athlete is pictured, without authorization, in his or her team uniform. Another example includes a page with a probable advertisement identified by the e-commerce metrics system 100 which is verified by a human user during step 324.

    In one embodiment of the invention, the directed page processing module 150 application allows several users to visit, prioritize, and add analysis data to the preliminary set of URLs. As a user on any of the plurality of workstations 130 or workstations 140 visits and prioritizes a Web site 106 corresponding to a URL on the preliminary list, it is marked so no duplication of effort occurs. Further, the e-commerce metrics system 100 is also capable of logging, for record keeping purposes, which user has analyzed a page including a time stamp of when the analysis took place.

    It should be noted that in alternative embodiments of the invention, the score pages step 316, full archive step 320, group pages step 322, and select groups step 324 may be performed in an order different than that presented herein without departing from the spirit and scope of the invention.

    In step 328, the e-commerce metrics system 100 obtains additional information for each URL in the second refined set. This additional information is used to provide contact, routing and other information which does not need to be repeatedly determined (e.g., via searching) or is expensive in terms of the time required to gather, the monetary cost, and/or other resources. In one embodiment, this configuration is a result of the time required to operate on a subset of pages. For instance, the e-commerce metrics system 100, in an automated fashion, obtains the contact information from the Internet. The sources for this information include the Network Information Center (InterNIC). As is well-known in the relevant art(s), InterNIC is a consortium originated by the National Science Foundation to coordinate information services, directory and database services, and registration services within the Internet (i.e., computer network 103).

    In step 330, a report is generated for the user. The report may be customized for a particular entity and typically includes the refined list of URLs, the contact information for each URL, the score for each URL, metadata provided by the processing modules 201, data provided by users of the e-commerce metrics system 100 (i.e., during step 324), as well as charts and graphs containing any metrics the user may request. Database 120 is utilized to query the archived metadata in generating reports, using the tables. Reports may relay information, for example, on how downloaded pages have changed over time. A more detailed description of output reports, with examples, is presented in Section VII below.

    In step 332, the user, using the report, may then take action in accordance with the information presented in the report. In one embodiment of the invention, the information contained in the output report may be used by the e-commerce metrics system 100 to be directly inputted into an entity's business model. For example, the output report may be used to automatically generate: (1) Cease and desist letters (customized for each entity) to each offending Web site 106 operator; (2) Re

