System and method for pre-processing input data to a support vector machine. Number: 7,020,642, from the United States Patent and Trademark Office (PTO)


Title: System and method for pre-processing input data to a support vector machine

Abstract: A system and method for preprocessing input data to a support vector machine (SVM). The SVM is a system model having parameters that define the representation of the system being modeled, and operates in two modes: run-time and training. A data preprocessor preprocesses received data in accordance with predetermined preprocessing parameters, and outputs preprocessed data. The data preprocessor includes an input buffer for receiving and storing the input data. The input data may be on different time scales. A time merge device determines a desired time scale and reconciles the input data so that all of the input data are placed on the desired time scale. An output device outputs the reconciled data from the time merge device as preprocessed data. The reconciled data may be input to the SVM in training mode to train the SVM, and/or in run-time mode to generate control parameters and/or predictive output information.

Patent Number: 7,020,642, issued on 03/28/2006 to Ferguson et al.


Inventors: Ferguson; Bruce (Round Rock, TX); Hartman; Eric (Austin, TX)
Assignee: Pavilion Technologies, Inc. (Austin, TX)
Appl. No.: 051574
Filed: January 18, 2002

Current U.S. Class: 706/21; 700/53
Current Intern'l Class: G06F 15/18 (20060101); G06E 1/00 (20060101)
Field of Search: 706/21,16 700/53,31,47,52


References Cited

U.S. Patent Documents
5479573    Dec., 1995    Keeler et al.
5729661    Mar., 1998    Keeler et al.
5842189    Nov., 1998    Keeler et al.
6128608    Oct., 2000    Barnhill.
6157921    Dec., 2000    Barnhill.
6161130    Dec., 2000    Horvitz et al.
6427141    Jul., 2002    Barnhill.


Other References

Michael E. Tipping, Sparse Bayesian Learning and Relevance Vector Machine, Microsoft Research, 2001.
Karypis, George et al. "Fast Supervised dimensionality Reduction Algorithm with Applications to Document Categorization & Retrieval", ACM Proceedings of the ninth international conference on Information and Knowledge Management, 2000, pp. 12-19.
Sebastiani, Fabrizio "Machine Learning in Automated Text Categorization", ACM Computing Surveys, vol. 34, No. 1, Mar. 2002, pp. 1-47.
International Search Report, Application No. PCT/US 03/01582, mailed Apr. 10, 2003.

Primary Examiner: Knight; Anthony
Assistant Examiner: Holmes; Michael B.
Attorney, Agent or Firm: Meyertons Hood Kivlin Kowert & Goetzel, P.C., Hood; Jeffrey C., Williams; Mark S.

Claims



What is claimed is:

1. A system for preprocessing input data for a support vector machine comprising:

a support vector machine, wherein the support vector machine comprises multiple inputs, and wherein each input is associated with a respective portion of input data;

an input buffer for receiving and storing the input data, the input data associated with at least two of the inputs being on different time scales relative to each other;

a time merge device for selecting a predetermined time scale and reconciling the input data stored in the input buffer such that all of the input data for all of the inputs are on the same time scale; and

an output device for outputting the data reconciled by the time merge device as reconciled data, said reconciled data comprising the input data to the support vector machine;

wherein the support vector machine is operable to receive the reconciled data as input data to the multiple inputs, and to generate output data in accordance with the reconciled data.
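As an illustration only, the three elements of claim 1 (input buffer, time merge device, output device) can be sketched in a few lines of Python. The patent does not prescribe a reconciliation method; linear interpolation is assumed here, and the names `interpolate`, `time_merge`, `buffer`, and the sample values are hypothetical.

```python
from bisect import bisect_left

def interpolate(times, values, t):
    """Linearly interpolate (times, values) at time t; clamp outside the range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)          # times[i-1] < t <= times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def time_merge(buffer, grid):
    """Reconcile every buffered input onto the same time grid."""
    return {name: [interpolate(ts, vs, t) for t in grid]
            for name, (ts, vs) in buffer.items()}

# Input buffer: each input keeps its own timestamps (hypothetical sample data).
buffer = {
    "temperature": ([0, 10, 20], [100.0, 110.0, 120.0]),              # every 10 s
    "pressure": ([0, 5, 10, 15, 20], [1.0, 2.0, 3.0, 4.0, 5.0]),      # every 5 s
}
grid = [0, 5, 10, 15, 20]               # the selected common time scale
reconciled = time_merge(buffer, grid)

# Output step: one row per time step, ready to feed to a model's inputs.
rows = list(zip(grid, reconciled["temperature"], reconciled["pressure"]))
```

Each row in `rows` then carries one value per input at a shared timestamp, which is the form a multi-input model expects.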

2. The data preprocessor of claim 1, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, said model parameters capable of being trained;

wherein the input data comprise training data including target input data and target output data, wherein said reconciled data comprise reconciled training data including reconciled target input data and reconciled target output data, and wherein said reconciled target input data and reconciled target output data are both based on a common time scale; and

wherein the support vector machine is operable to be trained according to a predetermined training algorithm applied to said reconciled target input data and said reconciled target output data to develop model parameter values such that said support vector machine has stored therein a representation of the system that generated the target output data in response to the target input data.

3. The data preprocessor of claim 1, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, wherein said model parameters of said support vector machine have been trained to represent said system;

wherein the input data comprise run-time data, and wherein said reconciled data comprise reconciled run-time data; and

wherein the support vector machine is operable to receive said reconciled run-time data and generate run-time output data, wherein said run-time output data comprise one or both of control parameters for said system and predictive output information for said system.

4. The data preprocessor of claim 3, wherein said control parameters are usable to determine control inputs to said system for run-time operation of said system.

5. The data preprocessor of claim 1, wherein the input data associated with at least one of the inputs has missing data in an associated time sequence and said time merge device is operable to reconcile said input data to fill in said missing data.
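Filling in missing data, as recited in claim 5, could take many forms. The sketch below assumes equally spaced samples with gaps marked as `None` and fills them by linear interpolation; both the gap marker and the interpolation rule are assumptions for illustration, not the patent's stated method.

```python
def fill_missing(values):
    """Fill None entries by linear interpolation between known neighbours.

    Leading and trailing gaps are filled with the nearest known value.
    Assumes equally spaced samples and at least one known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    # Edge gaps: copy the nearest known value.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the bracketing known samples.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled

print(fill_missing([1.0, None, None, 4.0, None]))
```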

6. The data preprocessor of claim 1, wherein the input data associated with a first one or more of the inputs has an associated time sequence based on a first time interval, and a second one or more of the inputs has an associated time sequence based on a second time interval; and

wherein said time merge device is operable to reconcile said input data associated with said first one or more of the inputs to said input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said at least one of the inputs having an associated time sequence based on said second time interval.

7. The data preprocessor of claim 1, wherein the input data associated with a first one or more of the inputs has an associated time sequence based on a first time interval, and wherein the input data associated with a second one or more of the inputs has an associated time sequence based on a second time interval; and

wherein said time merge device is operable to reconcile said input data associated with said first one or more of the inputs and said input data associated with said second one or more of the inputs to a time scale based on a third time interval, thereby generating reconciled input data associated with said first one or more of the inputs and said second one or more of the inputs having an associated time sequence based on said third time interval.
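When the target interval is coarser than either input's own interval, one plausible way to reconcile both inputs to a third interval (claim 7) is to average the samples that fall in each target bin. This is a sketch under that assumption; `resample_mean` and the sample data are hypothetical.

```python
def resample_mean(times, values, start, stop, interval):
    """Average the samples falling in each bin [t, t + interval).

    Returns (bin_start_times, bin_means); empty bins yield None.
    """
    n_bins = int((stop - start) // interval)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        k = int((t - start) // interval)
        if 0 <= k < n_bins:
            sums[k] += v
            counts[k] += 1
    grid = [start + k * interval for k in range(n_bins)]
    means = [s / c if c else None for s, c in zip(sums, counts)]
    return grid, means

# Hypothetical inputs: 'a' every 1 s, 'b' every 2 s, both reconciled to 4 s.
a_t, a_v = list(range(8)), [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
b_t, b_v = [0, 2, 4, 6], [10.0, 20.0, 30.0, 40.0]
grid, a4 = resample_mean(a_t, a_v, start=0, stop=8, interval=4)
_, b4 = resample_mean(b_t, b_v, start=0, stop=8, interval=4)
```

The case in claim 6 is the special case in which the target interval is simply the second input's own interval.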

8. The data preprocessor of claim 1, wherein the input data associated with a first one or more of the inputs is asynchronous, and wherein the input data associated with a second one or more of the inputs is synchronous with an associated time sequence based on a time interval; and

wherein said time merge device is operable to reconcile said asynchronous input data associated with said first one or more of the inputs to said synchronous input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs, wherein said reconciled input data comprise synchronous input data having an associated time sequence based on said time interval.
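For asynchronous data such as irregularly timed lab results, a common choice is a zero-order hold: at each tick of the synchronous input, take the most recent asynchronous value. The patent leaves the method open, so the following is an assumed approach with hypothetical names and data.

```python
from bisect import bisect_right

def hold_last(event_times, event_values, grid, initial=None):
    """Sample an irregular event stream on a regular grid (zero-order hold).

    At each grid time, return the most recent event value at or before it,
    or `initial` if no event has occurred yet. event_times must be sorted.
    """
    out = []
    for t in grid:
        i = bisect_right(event_times, t)   # number of events with time <= t
        out.append(event_values[i - 1] if i else initial)
    return out

# Hypothetical lab results arriving at irregular times (seconds).
lab_t = [3, 11, 12, 27]
lab_v = [0.50, 0.55, 0.53, 0.60]
grid = list(range(0, 31, 10))             # synchronous input sampled every 10 s
print(hold_last(lab_t, lab_v, grid))
```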

9. The data preprocessor of claim 1, wherein said input buffer is controllable to arrange the input data in a predetermined format.

10. The data preprocessor of claim 9, wherein the input data, prior to being arranged in said predetermined format, has a predetermined time reference for all data, such that each piece of input data has associated therewith a time value relative to said predetermined time reference.
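A time value relative to a predetermined time reference can be pictured as an offset in seconds from a chosen starting instant. The reference date and function name below are hypothetical and serve only as an example.

```python
from datetime import datetime

def to_relative_seconds(timestamps, reference):
    """Convert absolute timestamps to seconds elapsed since `reference`."""
    return [(ts - reference).total_seconds() for ts in timestamps]

reference = datetime(2002, 1, 18, 0, 0, 0)     # hypothetical reference time
stamps = [
    datetime(2002, 1, 18, 0, 0, 30),
    datetime(2002, 1, 18, 0, 5, 0),
    datetime(2002, 1, 18, 1, 0, 0),
]
print(to_relative_seconds(stamps, reference))
```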

11. The data preprocessor of claim 1, wherein each piece of data has associated therewith a time value corresponding to the time the input data was generated.

12. The data preprocessor of claim 1, further comprising:

a pre-time merge processor for applying a predetermined algorithm to the input data received by said input buffer prior to input to said time merge device.

13. The data preprocessor of claim 12, wherein each piece of data has associated therewith a time value corresponding to the time the input data was generated.

14. The data preprocessor of claim 12, further comprising:

an input device for selecting said predetermined algorithm from a group of available algorithms.
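Claims 12 to 14 describe applying an algorithm, chosen from a group of available algorithms, to the raw data before time merging. A minimal sketch is a lookup table of named transforms. The three transforms shown (clipping, logarithm, moving average) are assumed examples; the claims do not enumerate the group.

```python
import math

def clip(values, low, high):
    """Limit each value to the range [low, high] (crude outlier handling)."""
    return [min(max(v, low), high) for v in values]

def log_transform(values):
    """Natural log of each value; assumes all values are positive."""
    return [math.log(v) for v in values]

def moving_average(values, window):
    """Trailing moving average; the first samples use a shorter window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# The group of available algorithms a user could select from (hypothetical).
ALGORITHMS = {"clip": clip, "log": log_transform, "smooth": moving_average}

def pre_time_merge(values, name, **params):
    """Apply the selected algorithm to raw data before time merging."""
    return ALGORITHMS[name](values, **params)

raw = [1.0, 2.0, 250.0, 3.0]
print(pre_time_merge(raw, "clip", low=0.0, high=10.0))
```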

15. The data preprocessor of claim 1, wherein said output device further comprises a post-time merge processor for applying a predetermined algorithm to the data reconciled by said time merge device prior to output as said reconciled data.

16. The data preprocessor of claim 15, further comprising:

an input device for selecting said predetermined algorithm from a group of available algorithms.
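A post-time-merge algorithm could be any transform applied to the reconciled data. Because kernel-based support vector machines are generally sensitive to the scale of their inputs, standardizing each reconciled column is one plausible choice. It is assumed here for illustration and is not specified in the claims; the column names and values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]       # constant column carries no signal
    return [(v - mu) / sigma for v in column]

# Hypothetical reconciled data: both inputs already share one time scale.
reconciled = {"temperature": [100.0, 105.0, 110.0, 115.0, 120.0],
              "pressure": [1.0, 2.0, 3.0, 4.0, 5.0]}
scaled = {name: standardize(col) for name, col in reconciled.items()}
```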

17. The data preprocessor of claim 1, wherein the input data comprise a plurality of variables, each of the variables comprising an input variable with an associated set of data wherein each of said variables comprises an input to said input buffer; and

wherein each of at least a subset of said variables comprises a corresponding one of the inputs to the support vector machine.

18. The data preprocessor of claim 17, further comprising: a delay device for receiving reconciled data associated with a select one of said input variables and introducing a predetermined amount of delay to said reconciled data to output a delayed input variable and associated set of delayed input reconciled data.
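Once the data share a time scale, a fixed delay amounts to shifting one variable's series by a whole number of samples. The sketch below assumes the delay is expressed in samples; `delay` and `feed_rate` are hypothetical names.

```python
def delay(values, steps, fill=None):
    """Shift a reconciled series later in time by `steps` samples.

    Position i of the result holds the value from `steps` samples earlier;
    the first `steps` positions have no history and are set to `fill`.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    if steps == 0:
        return list(values)
    return [fill] * min(steps, len(values)) + list(values[:-steps])

feed_rate = [5.0, 6.0, 7.0, 8.0, 9.0]
feed_rate_delayed = delay(feed_rate, steps=2)
print(feed_rate_delayed)
```

The delayed series can then be supplied as an additional model input alongside the undelayed one.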

19. The data preprocessor of claim 18, wherein said predetermined amount of delay is a function of an external variable, the data preprocessor further comprising:

means for varying said predetermined amount of delay as a function of said external variable.
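A delay that depends on an external variable can be pictured with a transport-delay example: material takes longer to travel through a vessel when the flow is low. The rule delay = volume / flow, and every name and number below, is a hypothetical illustration rather than anything stated in the claims.

```python
def variable_delay(values, delays, fill=None):
    """Delay each sample by its own number of steps.

    result[i] = values[i - delays[i]], or `fill` if that index precedes
    the start of the series. `delays` holds non-negative integers.
    """
    out = []
    for i, d in enumerate(delays):
        j = i - d
        out.append(values[j] if j >= 0 else fill)
    return out

def transport_delay_steps(flow, volume, sample_interval):
    """Hypothetical rule: delay = volume / flow, rounded to whole samples."""
    return [round(volume / f / sample_interval) for f in flow]

inlet_temp = [20.0, 21.0, 22.0, 23.0, 24.0, 25.0]
flow = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0]      # external variable, e.g. m^3/s
steps = transport_delay_steps(flow, volume=2.0, sample_interval=1.0)
print(variable_delay(inlet_temp, steps))
```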

20. The data preprocessor of claim 18, further comprising:

means for learning said predetermined delay as a function of training parameters generated by a system modeled by the support vector machine.

21. The data preprocessor of claim 1, further comprising:

a graphical user interface (GUI) which is operable to receive user input specifying one or more data manipulation and/or reconciliation operations to be performed on said input data.

22. The data preprocessor of claim 21, wherein said GUI is further operable to display said input data prior to and after performing said manipulation and/or reconciliation operations on said input data.

23. The data preprocessor of claim 21, wherein said GUI is further operable to receive user input specifying a portion of said input data for said data manipulation and/or reconciliation operations.

24. A system for preprocessing input data for a support vector machine comprising:

a support vector machine, wherein the support vector machine comprises multiple inputs, and wherein each input is associated with a respective portion of input data;

an input buffer for receiving and storing the input data, the input data associated with at least two of the inputs being on different independent variable scales relative to each other;

a merge device for selecting a predetermined independent variable scale and reconciling the input data stored in the input buffer such that all of the input data for all of the inputs are on the same independent variable scale; and

an output device for outputting the data reconciled by the merge device as reconciled data, said reconciled data comprising the input data to the support vector machine;

wherein the support vector machine is operable to receive the reconciled data as input data to the multiple inputs, and to generate output data in accordance with the reconciled data.

25. The data preprocessor of claim 24, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, said model parameters capable of being trained;

wherein the input data comprise training data including target input data and target output data, wherein said reconciled data comprise reconciled training data including reconciled target input data and reconciled target output data, and wherein said reconciled target input data and reconciled target output data are both based on a common independent variable scale; and

wherein the support vector machine is operable to be trained according to a predetermined training algorithm applied to said reconciled target input data and said reconciled target output data to develop model parameter values such that said support vector machine has stored therein a representation of the system that generated the target output data in response to the target input data.

26. The data preprocessor of claim 24, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, wherein said model parameters of said support vector machine have been trained to represent said system;

wherein the input data comprise run-time data, and wherein said reconciled data comprise reconciled run-time data; and

wherein the support vector machine is operable to receive said reconciled run-time data and generate run-time output data, wherein said run-time output data comprise one or both of control parameters for said system and predictive output information for said system.

27. The data preprocessor of claim 26, wherein the input data associated with at least one of the inputs has missing data in an associated independent variable sequence; and

wherein said merge device is operable to reconcile said input data to fill in said missing data.

28. The data preprocessor of claim 24, wherein the input data associated with a first one or more of the inputs has an associated independent variable sequence based on a first interval, and a second one or more of the inputs has an associated independent variable sequence based on a second interval; and

wherein said merge device is operable to reconcile said input data associated with said first one or more of the inputs to said input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs having an associated independent variable sequence based on said second interval.

29. The data preprocessor of claim 24, wherein a first one or more of the inputs has an associated independent variable sequence based on a first interval, and wherein the input data associated with a second one or more of the inputs has an associated independent variable sequence based on a second interval; and

wherein said merge device is operable to reconcile said input data associated with said first one or more of the inputs and said input data associated with said second one or more of the inputs to an independent variable scale based on a third interval, thereby generating reconciled input data associated with said first one or more of the inputs and said second one or more of the inputs having an associated independent variable sequence based on said third interval.

30. The data preprocessor of claim 24, wherein the input data associated with a first one or more of the inputs is asynchronous with respect to an independent variable, and wherein the input data associated with a second one or more of the inputs is synchronous with an associated independent variable sequence based on an interval; and

wherein said merge device is operable to reconcile said asynchronous input data associated with said first one or more of the inputs to said synchronous input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs, and wherein said reconciled input data comprise synchronous input data having an associated independent variable sequence based on said interval.

31. A method for preprocessing input data prior to input to a support vector machine having multiple inputs, each of the inputs associated with a portion of the input data, the method comprising:

receiving and storing the input data, the input data associated with at least two of the inputs being on different time scales relative to each other;

time merging the input data for the inputs such that all of the input data are reconciled to the same time scale;

outputting the reconciled time merged data as reconciled data, the reconciled data comprising the input data to the support vector machine;

providing the reconciled data as input data to the multiple inputs of the support vector machine; and

the support vector machine generating output data in accordance with the reconciled data.

32. The method of claim 31, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, said model parameters capable of being trained; and

wherein the input data comprise training data including target input data and target output data, wherein said reconciled data comprise reconciled training data including reconciled target input data and reconciled target output data, and wherein said reconciled target input data and reconciled target output data are both based on a common time scale;

the method further comprising:

training the support vector machine according to a predetermined training algorithm applied to said reconciled target input data and said reconciled target output data to develop model parameter values such that said support vector machine has stored therein a representation of the system that generated the target output data in response to the target input data.

33. The method of claim 31, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, wherein said model parameters of said support vector machine have been trained to represent said system; and

wherein the input data comprise run-time data, and wherein said reconciled data comprise reconciled run-time data;

the method further comprising:

inputting said reconciled run-time data into the support vector machine to generate run-time output data, wherein said run-time output data comprise one or both of control parameters for said system and predictive output information for said system.

34. The method of claim 33, wherein said control parameters are usable to determine control inputs to said system for run-time operation of said system.

35. The method of claim 31, wherein the input data associated with at least one of the inputs has missing data in an associated time sequence; and

wherein said time merging comprises:

reconciling said input data to fill in said missing data.

36. The method of claim 31, wherein the input data associated with a first one or more of the inputs has an associated time sequence based on a first time interval, and a second one or more of the inputs has an associated time sequence based on a second time interval; and

wherein said time merging comprises:

reconciling said input data associated with said first one or more of the inputs to said input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said at least one of the inputs having an associated time sequence based on said second time interval.

37. The method of claim 31, wherein the input data associated with a first one or more of the inputs has an associated time sequence based on a first time interval, and wherein the input data associated with a second one or more of the inputs has an associated time sequence based on a second time interval; and

wherein said time merging comprises:

reconciling said input data associated with said first one or more of the inputs and said input data associated with said second one or more of the inputs to a time scale based on a third time interval, thereby generating reconciled input data associated with said first one or more of the inputs and said second one or more of the inputs having an associated time sequence based on said third time interval.

38. The method of claim 31, wherein the input data associated with a first one or more of the inputs is asynchronous, and wherein the input data associated with a second one or more of the inputs is synchronous with an associated time sequence based on a time interval; and

wherein said time merging comprises:

reconciling said asynchronous input data associated with said first one or more of the inputs to said synchronous input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs, wherein said reconciled input data comprise synchronous input data having an associated time sequence based on said time interval.

39. The method of claim 31, wherein said receiving and storing the input data comprises:

arranging the input data in a predetermined format.

40. The method of claim 39, wherein, prior to said arranging in said predetermined format, the input data has a predetermined time reference for all data, such that each piece of input data has associated therewith a time value relative to said predetermined time reference.

41. The method of claim 31, wherein each piece of data has associated therewith a time value corresponding to the time the input data was generated.

42. The method of claim 31, further comprising:

applying a predetermined algorithm to the input data received by said input buffer prior to said time merging.

43. The method of claim 42, wherein each piece of data has associated therewith a time value corresponding to the time the input data was generated.

44. The method of claim 42, further comprising:

selecting said predetermined algorithm from a group of available algorithms.

45. The method of claim 31, further comprising:

applying a predetermined algorithm to the reconciled time merged data prior to outputting said reconciled time merged data.

46. The method of claim 45, further comprising:

an input device for selecting said predetermined algorithm from a group of available algorithms.

47. The method of claim 31, wherein the input data comprise a plurality of variables, each of the variables comprising an input variable with an associated set of data wherein each of said variables comprises an input to said input buffer; and

wherein each of at least a subset of said variables comprises a corresponding one of the inputs to the support vector machine.

48. The method of claim 47, further comprising:

receiving reconciled data associated with a select one of said input variables; and

introducing a predetermined amount of delay to said reconciled data to output a delayed input variable and associated set of delayed reconciled input data.

49. The method of claim 48, wherein said predetermined amount of delay is a function of an external variable, the method further comprising:

varying said predetermined amount of delay as a function of said external variable.

50. The method of claim 48, further comprising:

learning said predetermined delay as a function of training parameters generated by a system modeled by the support vector machine.

51. The method of claim 31, further comprising:

a graphical user interface (GUI) receiving user input specifying one or more data manipulation and/or reconciliation operations to be performed on said input data.

52. The method of claim 51, further comprising:

the GUI displaying said input data prior to and after performing said manipulation and/or reconciliation operations on said input data.

53. The method of claim 51, further comprising:

the GUI receiving user input specifying a portion of said input data for said data manipulation and/or reconciliation operations.

54. A method for preprocessing input data for a support vector machine having multiple inputs, each of the inputs associated with a portion of the input data, comprising:

receiving and storing the input data, the input data associated with at least two of the inputs being on different independent variable scales relative to each other;

reconciling the input data stored in the input buffer such that all of the input data for all of the inputs are on the same independent variable scale to generate reconciled data; and

outputting reconciled data, said reconciled data comprising the input data to the support vector machine;

providing the reconciled data as input data to the multiple inputs of the support vector machine; and

the support vector machine generating output data in accordance with the reconciled data.

55. The method of claim 54, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, said model parameters capable of being trained; and

wherein the input data comprise training data including target input data and target output data, wherein said reconciled data comprise reconciled training data including reconciled target input data and reconciled target output data, and wherein said reconciled target input data and reconciled target output data are both based on a common independent variable scale;

the method further comprising:

training the support vector machine according to a predetermined training algorithm applied to said reconciled target input data and said reconciled target output data to develop model parameter values such that said support vector machine has stored therein a representation of the system that generated the target output data in response to the target input data.

56. The method of claim 54, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, wherein said model parameters of said support vector machine have been trained to represent said system; and

wherein the input data comprise run-time data, and wherein said reconciled data comprise reconciled run-time data;

the method further comprising:

inputting said reconciled run-time data into the support vector machine to generate run-time output data, wherein said run-time output data comprise one or both of control parameters for said system and predictive output information for said system.

57. The method of claim 56, wherein the input data associated with at least one of the inputs has missing data in an associated independent variable sequence; and

wherein said merging comprises:

reconciling said input data to fill in said missing data.

58. The method of claim 54, wherein the input data associated with a first one or more of the inputs has an associated independent variable sequence based on a first interval, and a second one or more of the inputs has an associated independent variable sequence based on a second interval; and

wherein said merging comprises:

reconciling said input data associated with said first one or more of the inputs to said input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs having an associated independent variable sequence based on said second interval.

59. The method of claim 54, wherein a first one or more of the inputs has an associated independent variable sequence based on a first interval, and wherein the input data associated with a second one or more of the inputs has an associated independent variable sequence based on a second interval; and

wherein said merging comprises:

reconciling said input data associated with said first one or more of the inputs and said input data associated with said second one or more of the inputs to an independent variable scale based on a third interval, thereby generating reconciled input data associated with said first one or more of the inputs and said second one or more of the inputs having an associated independent variable sequence based on said third interval.

60. The method of claim 54, wherein the input data associated with a first one or more of the inputs is asynchronous with respect to an independent variable, and wherein the input data associated with a second one or more of the inputs is synchronous with an associated independent variable sequence based on an interval; and

wherein said merging comprises:

reconciling said asynchronous input data associated with said first one or more of the inputs to said synchronous input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs, and wherein said reconciled input data comprise synchronous input data having an associated independent variable sequence based on said interval.

61. A system for preprocessing input data for a support vector machine comprising:

a support vector machine, wherein the support vector machine comprises multiple inputs, and wherein each input is associated with a respective portion of input data;

means for receiving and storing the input data, the input data associated with at least two of the inputs being on different independent variable scales relative to each other;

means for reconciling the input data stored in the input buffer such that all of the input data for all of the inputs are on the same independent variable scale to generate reconciled data; and

means for outputting reconciled data, said reconciled data comprising the input data to the support vector machines;

wherein the support vector machine is operable to receive the reconciled data as input data to the multiple inputs, and to generate output data in accordance with the reconciled data.

62. The system of claim 61, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, said model parameters capable of being trained; and

wherein the input data comprise training data including target input data and target output data, wherein said reconciled data comprise reconciled training data including reconciled target input data and reconciled target output data, and wherein said reconciled target input data and reconciled target output data are both based on a common independent variable scale;

the system further comprising:

means for training the support vector machine according to a predetermined training algorithm applied to said reconciled target input data and said reconciled target output data to develop model parameter values such that said support vector machine has stored therein a representation of the system that generated the target output data in response to the target input data.

63. The system of claim 61, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, wherein said model parameters of said support vector machine have been trained to represent said system; and

wherein the input data comprise run-time data, and wherein said reconciled data comprise reconciled run-time data;

the system further comprising:

means for inputting said reconciled run-time data into the support vector machine to generate run-time output data, wherein said run-time output data comprise one or both of control parameters for said system and predictive output information for said system.

64. The system of claim 63, wherein the input data associated with at least one of the inputs has missing data in an associated independent variable sequence; and

wherein said means for merging comprises:

means for reconciling said input data to fill in said missing data.

65. The system of claim 61, wherein the input data associated with a first one or more of the inputs has an associated independent variable sequence based on a first interval, and a second one or more of the inputs has an associated independent variable sequence based on a second interval; and

wherein said means for merging comprises:

means for reconciling said input data associated with said first one or more of the inputs to said input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs having an associated independent variable sequence based on said second interval.

66. The system of claim 61, wherein a first one or more of the inputs has an associated independent variable sequence based on a first interval, and wherein the input data associated with a second one or more of the inputs has an associated independent variable sequence based on a second interval; and

wherein said means for merging comprises:

means for reconciling said input data associated with said first one or more of the inputs and said input data associated with said second one or more of the inputs to an independent variable scale based on a third interval, thereby generating reconciled input data associated with said first one or more of the inputs and said second one or more of the inputs having an associated independent variable sequence based on said third interval.

67. The system of claim 61, wherein the input data associated with a first one or more of the inputs is asynchronous with respect to an independent variable, and wherein the input data associated with a second one or more of the inputs is synchronous with an associated independent variable sequence based on an interval; and

wherein said means for merging comprises:

means for reconciling said asynchronous input data associated with said first one or more of the inputs to said synchronous input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs, and wherein said reconciled input data comprise synchronous input data having an associated independent variable sequence based on said interval.

68. A memory medium which stores program instructions for preprocessing input data prior to input to a support vector machine having multiple inputs, each of the inputs associated with a portion of the input data, wherein said program instructions are executable to:

receive and store the input data, wherein the input data associated with at least two of the inputs are on different time scales relative to each other;

time merge the input data for the inputs such that all of the input data are reconciled to the same time scale; and

output the reconciled time merged data as reconciled data, the reconciled data comprising the input data to the support vector machines;

provide the reconciled data as input data to the multiple inputs of the support vector machine.

69. The memory medium of claim 68, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, said model parameters capable of being trained; and

wherein the input data comprise training data including target input data and target output data, wherein said reconciled data comprise reconciled training data including reconciled target input data and reconciled target output data, and wherein said reconciled target input data and reconciled target output data are both based on a common time scale;

wherein said program instructions are further executable to:

train the support vector machine according to a predetermined training algorithm applied to said reconciled target input data and said reconciled target output data to develop model parameter values such that said support vector machine has stored therein a representation of the system that generated the target output data in response to the target input data.

70. The memory medium of claim 68, wherein the support vector machine comprises a non-linear model having a set of model parameters defining a representation of a system, wherein said model parameters of said support vector machine have been trained to represent said system; and

wherein the input data comprise run-time data, and wherein said reconciled data comprise reconciled run-time data;

wherein said program instructions are further executable to:

input said reconciled run-time data into the support vector machine to generate run-time output data, wherein said run-time output data comprise one or both of control parameters for said system and predictive output information for said system.

71. The memory medium of claim 70, wherein said control parameters are usable to determine control inputs to said system for run-time operation of said system.

72. The memory medium of claim 68, wherein the input data associated with at least one of the inputs has missing data in an associated time sequence; and

wherein in performing said time merging said program instructions are further executable to:

reconcile said input data to fill in said missing data.

73. The memory medium of claim 68, wherein the input data associated with a first one or more of the inputs has an associated time sequence based on a first time interval, and a second one or more of the inputs has an associated time sequence based on a second time interval; and

wherein in performing said time merging said program instructions are further executable to:

reconcile said input data associated with said first one or more of the inputs to said input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said at least one of the inputs having an associated time sequence based on said second time interval.

74. The memory medium of claim 68, wherein the input data associated with a first one or more of the inputs has an associated time sequence based on a first time interval, and wherein the input data associated with a second one or more of the inputs has an associated time sequence based on a second time interval; and

wherein in performing said time merging said program instructions are further executable to:

reconcile said input data associated with said first one or more of the inputs and said input data associated with said second one or more of the inputs to a time scale based on a third time interval, thereby generating reconciled input data associated with said first one or more of the inputs and said second one or more of the inputs having an associated time sequence based on said third time interval.

75. The memory medium of claim 68, wherein the input data associated with a first one or more of the inputs is asynchronous, and wherein the input data associated with a second one or more of the inputs is synchronous with an associated time sequence based on a time interval; and

wherein in performing said time merging said program instructions are further executable to:

reconcile said asynchronous input data associated with said first one or more of the inputs to said synchronous input data associated with said second one or more of the inputs, thereby generating reconciled input data associated with said first one or more of the inputs, wherein said reconciled input data comprise synchronous input data having an associated time sequence based on said time interval.

76. The memory medium of claim 68, wherein in performing said receiving and storing said program instructions are further executable to:

arrange the input data in a predetermined format.

77. The memory medium of claim 76, wherein, prior to said arranging in said predetermined format, the input data has a predetermined time reference for all data, such that each piece of input data has associated therewith a time value relative to said predetermined time reference.

78. The memory medium of claim 68, wherein each piece of data has associated therewith a time value corresponding to the time the input data was generated.

79. The memory medium of claim 68, wherein said program instructions are further executable to:

apply a predetermined algorithm to the input data prior to said performing said time merging.

80. The memory medium of claim 79, wherein each piece of data has associated therewith a time value corresponding to the time the input data was generated.

81. The memory medium of claim 79, wherein said program instructions are further executable to:

select said predetermined algorithm from a group of available algorithms.

82. The memory medium of claim 68, wherein said program instructions are further executable to:

apply a predetermined algorithm to the reconciled time merged data prior to outputting said reconciled time merged data.

83. The memory medium of claim 82, wherein said program instructions are further executable to:

select said predetermined algorithm from a group of available algorithms.

84. The memory medium of claim 68, wherein the input data comprise a plurality of variables, each of the variables comprising an input variable with an associated set of data wherein each of said variables comprises an input to said input buffer; and

wherein each of at least a subset of said variables comprises a corresponding one of the inputs to the support vector machine.

85. The memory medium of claim 84, wherein said program instructions are further executable to:

receive reconciled data associated with a select one of said input variables; and

introduce a predetermined amount of delay to said reconciled data and output a delayed input variable and associated set of delayed reconciled input data.

86. The memory medium of claim 85, wherein said predetermined amount of delay is a function of an external variable, wherein said program instructions are further executable to:

vary said predetermined amount of delay as a function of said external variable.

87. The memory medium of claim 85, wherein said program instructions are further executable to:

learn said predetermined delay as a function of training parameters generated by a system modeled by the support vector machine.

88. The memory medium of claim 68, wherein said program instructions are further executable to present a graphical user interface (GUI), wherein said GUI is operable to receive user input specifying one or more data manipulation and/or reconciliation operations to be performed on said input data.

89. The memory medium of claim 88, wherein said GUI is further operable to display said input data prior to and after performing said manipulation and/or reconciliation operations on said input data.

90. The memory medium of claim 88, wherein said GUI is further operable to receive user input specifying a portion of said input data for said data manipulation and/or reconciliation operations.
Description



BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to the field of predictive system models. More particularly, the present invention relates to preprocessing of input data so as to correct for different time scales, transforms, missing or bad data, and/or time-delays prior to input to a support vector machine for either training of the support vector machine or operation of the support vector machine.

2. Description of the Related Art

Many predictive systems may be characterized by the use of an internal model which represents a process or system for which predictions are made. Predictive model types may be linear, non-linear, stochastic, or analytical, among others. However, for complex phenomena non-linear models may generally be preferred due to their ability to capture non-linear dependencies among various attributes of the phenomena. Examples of non-linear models may include neural networks and support vector machines (SVMs).

Generally, a model is trained with training data, e.g., historical data, in order to reflect salient attributes and behaviors of the phenomena being modeled. In the training process, sets of training data may be provided as inputs to the model, and the model output may be compared to corresponding sets of desired outputs. The resulting error is often used to adjust weights or coefficients in the model until the model generates the correct output (within some error margin) for each set of training data. The model is considered to be in "training mode" during this process. After training, the model may receive real-world data as inputs, and provide predictive output information which may be used to control the process or system or make decisions regarding the modeled phenomena. It is desirable to allow for pre-processing of input data of predictive models (e.g., non-linear models, including neural networks and support vector machines), particularly in the field of e-commerce.

Predictive models may be used for analysis, control, and decision making in many areas, including electronic commerce (i.e., e-commerce), e-marketplaces, financial (e.g., stocks and/or bonds) markets and systems, data analysis, data mining, process measurement, optimization (e.g., optimized decision making, real-time optimization), quality control, as well as any other field or domain where predictive or classification models may be useful and where the object being modeled may be expressed abstractly. For example, quality control in commerce is increasingly important. The control and reproducibility of quality is the focus of many efforts. For example, in Europe, quality is the focus of the ISO (International Organization for Standardization, Geneva, Switzerland) 9000 standards. These rigorous standards provide for quality assurance in production, installation, final inspection, and testing of processes. They also provide guidelines for quality assurance between a supplier and customer.

A common problem that is encountered in training support vector machines for prediction, forecasting, pattern recognition, sensor validation and/or processing problems is that some of the training/testing patterns may be missing, corrupted, and/or incomplete. Prior systems merely discarded data with the result that some areas of the input space may not have been covered during training of the support vector machine. For example, if the support vector machine is utilized to learn the behavior of a chemical plant as a function of the historical sensor and control settings, these sensor readings are typically sampled electronically, entered by hand from gauge readings, and/or entered by hand from laboratory results. It is a common occurrence in real-world problems that some or all of these readings may be missing at a given time. It is also common that the various values may be sampled on different time intervals. Additionally, any one value may be "bad" in the sense that after the value is entered, it may be determined by some method that a data item was, in fact, incorrect. Hence, if a given set of data has missing values, and that given set of data is plotted in a table, the result may be a partially filled-in table with intermittent missing data or "holes". These "holes" may correspond to "bad" data or "missing" data.

Conventional support vector machine training and testing methods require complete patterns such that they are required to discard patterns with missing or bad data. The deletion of the bad data in this manner is an inefficient method for training a support vector machine. For example, suppose that a support vector machine has ten inputs and ten outputs, and also suppose that one of the inputs or outputs happens to be missing at the desired time for fifty percent or more of the training patterns. Conventional methods would discard these patterns, leading to no training for those patterns during the training mode and no reliable predicted output during the run mode. The predicted output corresponding to those certain areas may be somewhat ambiguous and/or erroneous. In some situations, there may be as much as a 50% reduction in the overall data after screening bad or missing data. Additionally, experimental results have shown that support vector machine testing performance generally increases with more training data; therefore, throwing away bad or incomplete data may decrease the overall performance of the support vector machine.
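The cost of discarding incomplete patterns may be illustrated with a minimal Python sketch. The data, the use of None to mark a missing or bad value, and the helper name complete_patterns are assumptions made for illustration only and are not taken from the disclosure.

```python
from typing import List, Optional

Pattern = List[Optional[float]]  # None marks a missing or "bad" value (assumed convention)

def complete_patterns(patterns: List[Pattern]) -> List[Pattern]:
    """Keep only patterns with no missing values, as a conventional method would."""
    return [p for p in patterns if all(v is not None for v in p)]

# Hypothetical training patterns; the second column has "holes" in half of the rows.
patterns = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [1.2, 2.2, 3.2],
    [1.3, None, 3.3],
]

kept = complete_patterns(patterns)
discarded_fraction = 1 - len(kept) / len(patterns)
print(len(kept), discarded_fraction)  # 2 0.5
```

In this toy case a single unreliable variable removes half of the patterns, even though the other values in the discarded rows are usable.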

Another common issue concerning input data for support vector machines relates to situations when the data are retrieved on different time scales. As used herein, the term "time scale" is meant to refer to any aspect of the time-dependency of data. As is well known in the art, input data to a support vector machine is generally required to share the same time scale to be useful. This constraint applies to data sets used to train a support vector machine, i.e., input to the SVM in training mode, and to data sets used as input for run-time operation of a support vector machine, e.g., input to the SVM in run-time mode. Additionally, the time scale of the training data generally must be the same as that of the run-time input data to ensure that the SVM behavior in run-time mode corresponds to the trained behavior learned in training mode.

In one example of input data (for training and/or operation) with differing time scales, one set of data may be taken on an hourly basis and another set of data taken on a quarter hour (i.e., every fifteen minutes) basis. In this case, for three out of every four data records on the quarter hour basis there will be no corresponding data from the hourly set. Thus, the two data sets are differently synchronous, i.e., have different time scales.
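The mismatch between an hourly set and a quarter-hour set may be made concrete with a short Python sketch. The readings, the minute-based timestamps, and the helper name join_on_time are hypothetical and serve only to illustrate the example above.

```python
# Timestamps are minutes from a common reference (an assumption for this sketch).
hourly = {0: 10.0, 60: 14.0}                          # one reading per hour
quarter_hourly = {0: 1.0, 15: 1.5, 30: 2.0, 45: 2.5}  # one reading every 15 minutes

def join_on_time(fine, coarse):
    """Pair each fine-grained record with the coarse reading taken at the same
    time, using None where the coarse set has no reading."""
    return [(t, fine[t], coarse.get(t)) for t in sorted(fine)]

rows = join_on_time(quarter_hourly, hourly)
unmatched = sum(1 for _, _, c in rows if c is None)
print(unmatched, "of", len(rows), "quarter-hour records have no hourly value")
```

Within the first hour, three of the four quarter-hour records have no hourly counterpart, which is the gap that a time merge would have to reconcile.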

As another example of different time scales for input data sets, in one data set the data sample periods may be non-periodic, producing asynchronous data, while another data set may be periodic or synchronous, e.g., hourly. These two data sets may not be useful together as input to the SVM while their time-dependencies, i.e., their time scales, differ. In another example of data sets with differing time scales, one data set may have a "hole" in the data, as described above, compared to another set, i.e., some data may be missing on one of the data sets. The presence of the hole may be considered to be an asynchronous or anomalous time interval in the data set, and thus may be considered to have an asynchronous or inhomogeneous time scale.
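One possible way to bring asynchronous samples onto a synchronous grid is linear interpolation, sketched below in Python. Linear interpolation is an assumption chosen for illustration; the passage above does not prescribe a particular reconciliation method, and the function names and sample values are hypothetical.

```python
from bisect import bisect_left

def interpolate(times, values, t):
    """Linearly interpolate the series (times, values) at time t.
    times must be sorted ascending; a t outside the sampled range yields None."""
    if not times or t < times[0] or t > times[-1]:
        return None
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def to_grid(times, values, start, stop, step):
    """Resample an irregular series onto the regular grid start, start+step, ..., stop."""
    return [(t, interpolate(times, values, t)) for t in range(start, stop + 1, step)]

# Hypothetical asynchronous samples (minutes, value) reconciled to an hourly grid.
async_times = [0, 7, 50, 120]
async_values = [2.0, 4.0, 10.0, 24.0]
grid = to_grid(async_times, async_values, 0, 120, 60)
print(grid)  # [(0, 2.0), (60, 12.0), (120, 24.0)]
```

The same routine would apply when the independent variable is something other than time, since it depends only on the ordering of the sample positions.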

In yet another example of different time scales for input data sets, two data sets may have two different respective time scales, e.g., an hourly basis and a 15 minute basis. The desired time scale for input data to the SVM may have a third basis, e.g., daily.
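Reconciling two sets to a third, coarser basis may be sketched as bucketed averaging. Averaging is only one possible choice and is an assumption here; the data and the name resample_mean are likewise hypothetical.

```python
from collections import defaultdict

def resample_mean(samples, interval):
    """Average (time, value) samples into buckets of width `interval`.
    Returns a dict mapping each bucket's start time to the mean of its samples."""
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[(t // interval) * interval].append(v)
    return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}

DAY = 24 * 60  # minutes per day

# Two days of hypothetical data: one set hourly, the other every 15 minutes.
hourly = [(t, 1.0 if t < DAY else 3.0) for t in range(0, 2 * DAY, 60)]
quarter = [(t, 2.0 if t < DAY else 4.0) for t in range(0, 2 * DAY, 15)]

daily_a = resample_mean(hourly, DAY)
daily_b = resample_mean(quarter, DAY)

# Both sets now share the daily time scale and can be joined record by record.
merged = [(day, daily_a[day], daily_b[day]) for day in sorted(daily_a)]
print(merged)  # [(0, 1.0, 2.0), (1440, 3.0, 4.0)]
```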

While the issues above have been described with respect to time-dependent data, i.e., where the independent variable of the data is time, t, these same issues may arise with different independent variables. In other words, instead of data being dependent upon time, e.g., D(t), the data may be dependent upon some other variable, e.g., D(x).

In addition to data retrieved over different time periods, data may also be taken on different machines in different locations with different operating systems and quite different data formats. It is essential to be able to read all of these different data formats, keeping track of the data values and the timestamps of the data, and to store both the data values and the timestamps for future use. It is a formidable task to retrieve these data, keeping track of the timestamp information, and to read it into an internal data format (e.g., a spreadsheet) so that the data may be time merged.

Inherent delays in a system are another issue which may affect the use of time-dependent data. For example, in a chemical processing system, a flow meter output may provide data at time t0 at a given value. However, a given change in flow resulting in a different reading on the flow meter may not affect the output for a predetermined delay τ. In order to predict the output, this flow meter output must be input to the support vector machine at a delay equal to τ. This must also be accounted for in the training of the support vector machine. Thus, the timeline of the data must be reconciled with the timeline of the process. In generating data that account for time delays, it has been postulated that it may be possible to generate a table of data that comprises both original data and delayed data. This may necessitate a significant amount of storage in order to store all of the delayed data and all of the original data, wherein only the delayed data are utilized. Further, in order to change the value of the delay, an entirely new set of input data must be generated from the original set.
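The storage concern described above may be contrasted with applying the delay at read time, so that only the original data are stored. The following Python sketch assumes integer time steps and a hypothetical helper named delayed; it is an illustration rather than the disclosed mechanism.

```python
def delayed(series, t, tau):
    """Return the value of `series` recorded at time t - tau, or None if no value
    was recorded then. `series` maps integer time steps to original readings."""
    return series.get(t - tau)

# Hypothetical flow meter readings indexed by time step.
flow = {0: 5.0, 1: 5.5, 2: 6.0, 3: 6.5}

tau = 2
inputs = [(t, delayed(flow, t, tau)) for t in range(4)]
print(inputs)  # [(0, None), (1, None), (2, 5.0), (3, 5.5)]

# Changing the delay only changes the lookup offset; no new table is generated.
inputs_tau1 = [(t, delayed(flow, t, 1)) for t in range(4)]
```

Because the delay is a parameter of the lookup, a different value of τ can be tried without regenerating or duplicating the stored data.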

Thus, improved systems and methods for preprocessing data for training and/or operating a support vector machine are desired.

SUMMARY OF THE INVENTION

A system and method are presented for preprocessing input data to a non-linear predictive system model based on a support vector machine. The system model may utilize a support vector machine having a set of parameters associated therewith that define the representation of the system being modeled. The support vector machine may have multiple inputs, each of the inputs associated with a portion of the input data. The support vector machine parameters may be operable to be trained on a set of training data that is received from training data and/or a run-time system such that the system model is trained to represent the run-time system. The input data may include a set of target output data representing the output of the system and a set of measured input data representing the system variables. The target data and system variables may be reconciled by the preprocessor and then input to the support vector machine. A training device may be operable to train the support vector machine according to a predetermined training algorithm such that the values of the support vector machine parameters are changed until the support vector machine comprises a stored representation of the run-time system. Note that as used herein, the term "device" may refer to a software program, a hardware device, and/or a combination of the two.

In one embodiment of the present invention, the system may include a data storage device for storing training data from the run-time system. The support vector machine may operate in two modes, a run-time mode and a training mode. In the run-time mode, run-time data may be received from the run-time system. Similarly, in the training mode, data may be retrieved from the data storage device, the training data being both training input data and training output data. A data preprocessor may be provided for preprocessing received (i.e., input) data in accordance with predetermined preprocessing parameters to output preprocessed data. The data preprocessor may include an input buffer for receiving and storing the input data. The input data may be on different time scales. A time merge device may be operable to select a predetermined time scale and reconcile the input data so that all of the input data are placed on the same time scale. An output device may output the reconciled data from the time merge device as preprocessed data. The reconciled data may be used as input data to the system model, i.e., the support vector machine. In other embodiments, other scales than time scales may be determined for the data, and reconciled as described herein.

The support vector machine may have an input for receiving the preprocessed data, and may map it to an output through a stored representation of the run-time system in accordance with associated model parameters. A control device may control the data preprocessor to operate in either training mode or run-time mode. In the training mode, the preprocessor may be operable to process the stored training data and output preprocessed training data. A training device may be operable to train the support vector machine (in the training mode) on the training data in accordance with a predetermined training algorithm to define the model parameters on which the support vector machine operates. In the run-time mode, the preprocessor may be operable to preprocess run-time data received from the run-time system to output preprocessed run-time data. The support vector machine may then operate in the run-time mode, receiving the preprocessed input run-time data and generating a predicted output and/or control parameters for the run-time system.

The data preprocessor may further include a pre-time merge processor for applying one or more predetermined algorithms to the received data prior to input to the time merge device. A post-time merge processor (e.g., part of the output device) may be provided for applying one or more predetermined algorithms to the data output by the time merge device prior to output as the processed data. The preprocessed data may then have selective delay applied thereto prior to input to the support vector machine in both the run-time mode and the training mode. The one or more predetermined algorithms may be externally input and stored in a preprocessor memory such that the sequence in which the predetermined algorithms are applied is also stored.
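Storing both the selected algorithms and the sequence in which they are applied may be sketched as an ordered list of names looked up in a table of available algorithms. The two algorithms shown ("offset" and "scale"), the table ALGORITHMS, and the function apply_sequence are hypothetical placeholders, not algorithms named in the disclosure.

```python
# Hypothetical library of available algorithms; each maps a list of values to a new list.
ALGORITHMS = {
    "offset": lambda xs: [x - min(xs) for x in xs],  # shift so the minimum becomes 0
    "scale": lambda xs: [x / max(xs) for x in xs],   # divide by the maximum
}

def apply_sequence(names, data):
    """Apply the named algorithms to `data` in the stored order."""
    for name in names:
        data = ALGORITHMS[name](data)
    return data

stored_sequence = ["offset", "scale"]  # the order is stored along with the selection
result = apply_sequence(stored_sequence, [2.0, 4.0, 6.0])
print(result)  # [0.0, 0.5, 1.0]
```

Applying the same two algorithms in the opposite order gives a different result, which is why the sequence would need to be stored so that training and run-time data are processed identically.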

In one embodiment, the input data associated with at least one of the inputs of the support vector machine may have missing data in an associated time sequence. The time merge device may be operable to reconcile the input data to fill in the missing data.
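Filling in missing data in a regularly sampled sequence may be sketched as follows. Linear interpolation between the nearest known neighbours is an assumed fill rule, None is an assumed marker for a hole, and the name fill_holes is hypothetical.

```python
def fill_holes(values):
    """Fill None entries by linear interpolation between the nearest known
    neighbours. Leading or trailing holes are left as None because they have
    a known neighbour on one side only."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            filled[i] = values[a] + (values[b] - values[a]) * (i - a) / (b - a)
    return filled

# Hypothetical time sequence on a fixed interval with intermittent holes.
series = [1.0, None, None, 4.0, None, 6.0, None]
print(fill_holes(series))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, None]
```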

In one embodiment, the input data associated with a first one or more of the inputs may have an associated time sequence based on a first time interval, and a second one or more of the inputs may have an associated time sequence based on a second time interval. The time merge device may be operable to reconcile the input data associated with the first one or more of the inputs to the input data associated with the seco

