ROBUST AND INVARIANT AUDIO PATTERN MATCHING

Filing Information

  • Publication Number: WO2003091990
  • Application Number: US312126
  • Filing date: 04/18/2003
  • Publication date: 11/06/2003
  • U.S. Classifications: --
  • International Classifications: 7G 10L 21/00 A
  • Foreign Priority: US37605502 - 04/25/2002
27 Claims, 8 Drawings


Abstract

The present invention provides an innovative technique for rapidly and accurately determining whether two audio samples match, while remaining immune to various kinds of transformations, such as playback speed variation. The relationship between the two audio samples is characterized by first matching certain fingerprint objects derived from the respective samples. A set (230) of fingerprint objects (231, 232), each occurring at a particular location (242), is generated for each audio sample (210). Each location is determined in dependence upon the content of the respective audio sample (210), and each fingerprint object (232) characterizes one or more local features (222) at or near the respective particular location (242). A relative value is next determined for each pair of matched fingerprint objects. A histogram of the relative values is then generated. If a statistically significant peak is found, the two audio samples can be characterized as substantially matching.
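
The matching flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes, purely for illustration, that each fingerprint object is a (location, hash) tuple, that two fingerprint objects match when their hashes are equal, that the relative value is the time offset between matched locations, and that a peak is "statistically significant" when it passes a simple count and fraction threshold. The names match_pairs, histogram_peak and samples_match are hypothetical.

```python
from collections import Counter, defaultdict


def match_pairs(fp_a, fp_b):
    """Pair fingerprint objects whose hash values are equal (assumed match rule).

    fp_a, fp_b: lists of (location, hash_value) tuples.
    Returns a list of (location_a, location_b) pairs.
    """
    index = defaultdict(list)
    for loc_b, h in fp_b:
        index[h].append(loc_b)
    return [(loc_a, loc_b) for loc_a, h in fp_a for loc_b in index.get(h, [])]


def histogram_peak(values, bin_width=0.1):
    """Bin the values and return (bin_center, count) of the fullest bin."""
    bins = Counter(round(v / bin_width) for v in values)
    if not bins:
        return None, 0
    bin_index, count = bins.most_common(1)[0]
    return bin_index * bin_width, count


def samples_match(fp_a, fp_b, bin_width=0.1, min_count=3, min_fraction=0.5):
    """Return (is_match, peak_value).

    The relative value used here is the time offset loc_b - loc_a (an
    assumption). The significance test is a plain threshold, standing in
    for whatever statistical test a real system would apply.
    """
    pairs = match_pairs(fp_a, fp_b)
    offsets = [loc_b - loc_a for loc_a, loc_b in pairs]
    peak, count = histogram_peak(offsets, bin_width)
    significant = count >= min_count and count >= min_fraction * len(pairs)
    return significant, peak


if __name__ == "__main__":
    sample_a = [(0.0, "a"), (1.0, "b"), (2.0, "c"), (3.0, "d")]
    sample_b = [(5.0, "a"), (6.0, "b"), (7.0, "c"), (9.5, "d")]
    print(samples_match(sample_a, sample_b))
```

In this toy input, three of the four matched pairs share the same offset, so their votes pile up in one histogram bin, while the spurious pair lands elsewhere. Unrelated samples would be expected to spread their offsets across many bins and produce no dominant peak.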

References Cited

The current document has no citations.

Independent Claims | See all claims (27)

  1. A method of characterizing a relationship between a first and a second audio sample, comprising the steps of: generating a first set of fingerprint objects for the first audio sample, each fingerprint object occurring at a respective location within the first audio sample, the respective location being determined in dependence upon the content of the first audio sample, and each fingerprint object characterising one or more features of the first audio sample at or near each respective location;
  2. generating a second set of fingerprint objects for the second audio sample, each fingerprint object occurring at a respective location within the second audio sample, the respective location being determined in dependence upon the content of the second audio sample, and each fingerprint object characterising one or more features of the second audio sample at or near each respective location;
  3. pairing fingerprint objects by matching a first fingerprint object from the first audio sample with a second fingerprint object from the second audio sample that is substantially similar to the first fingerprint object;
  4. generating, based on the pairing step, a list of pairs of matched fingerprint objects;
  5. determining a relative value for each pair of matched fingerprint objects;
  6. generating a histogram of the relative values; and
  7. searching for a statistically significant peak in the histogram, the peak characterizing the relationship between the first and second audio samples.
  8. detecting if the relative pitch and a reciprocal of the relative playback speed are substantially different, in which case the relationship between the first and second audio samples is characterized as nonlinear.
  9. for each pair of matched fingerprint objects in the list, determining a compensated relative time offset value, t - R*t', where t and t' are locations in time with respect to the first and second fingerprint objects;
  10. generating a second histogram of the compensated relative time offset values; and
  11. searching for a statistically significant peak in the second histogram of the compensated relative time offset values, the peak further characterizing the relationship between the first and second audio samples.
  12. 17. A computer program product for performing a method according to any preceding claim.
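
The compensated relative time offset and the nonlinearity check recited above can be sketched as follows. This is a hedged illustration, not the claimed implementation. The excerpt does not define R, so it is assumed here to be a relative playback speed estimated beforehand, for example from the peak of an earlier histogram. The bin width and the tolerance used to decide "substantially different" are likewise assumptions, and the function names are hypothetical.

```python
from collections import Counter


def compensated_offsets(pairs, R):
    """Compute t - R * t_prime for each matched pair.

    pairs: list of (t, t_prime) locations in time for the first and
    second fingerprint objects. R is assumed to be the relative
    playback speed estimated in an earlier step.
    """
    return [t - R * t_prime for t, t_prime in pairs]


def peak_of(values, bin_width=0.1):
    """Histogram the values and return (bin_center, count) of the fullest bin."""
    bins = Counter(round(v / bin_width) for v in values)
    if not bins:
        return None, 0
    bin_index, count = bins.most_common(1)[0]
    return bin_index * bin_width, count


def is_nonlinear(relative_pitch, relative_speed, tolerance=0.02):
    """True when relative pitch and 1 / relative speed differ substantially.

    The 2% tolerance is an illustrative choice, not a value from the source.
    """
    reciprocal = 1.0 / relative_speed
    return abs(relative_pitch - reciprocal) > tolerance * abs(relative_pitch)


if __name__ == "__main__":
    # Toy data in which t = 2.0 + 1.25 * t_prime for every pair.
    pairs = [(2.0, 0.0), (3.25, 1.0), (4.5, 2.0), (5.75, 3.0)]
    print(peak_of(compensated_offsets(pairs, R=1.25)))
    print(is_nonlinear(relative_pitch=1.0, relative_speed=1.25))
```

With the correct R, the compensated offsets of true matches collapse into a single bin of the second histogram even though the raw offsets drift over time. The nonlinearity check flags cases such as a tempo change without the corresponding pitch change.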