[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*Subject*: Re: Avoiding a for cicle*From*: "J.D. Smith" <jdsmith(at)astro.cornell.edu>*Date*: Wed, 12 Apr 2000 13:59:13 -0400*Newsgroups*: comp.lang.idl-pvwave*Organization*: Cornell University*References*: <B51278FB.4C3D%zamb@physics.ucla.edu> <38EE2E2F.D5849FB3@astro.cornell.edu> <38F3210B.F2A82B04@mpipsykl.mpg.de> <onu2h8k7mx.fsf@cow.physics.wisc.edu> <38F34796.DD3D1BEF@astro.cornell.edu> <38F3B146.FCD65D71@astro.cornell.edu> <onr9cbjq6p.fsf@cow.physics.wisc.edu>*Sender*: verified_for_usenet(at)cornell.edu (jts11 on vodka.tn.cornell.edu)*Xref*: news.doit.wisc.edu comp.lang.idl-pvwave:19228

Craig Markwardt wrote: > > "J.D. Smith" <jdsmith@astro.cornell.edu> writes: > > > > Alright code slingers... new challenge... find location of all peaks in a region > > > of n points (n odd), monotonically decreasing away from the peak. I.e. find > > > peaks of width n. > > > > > > > Since noone will take my challenge I'm forced to claim the prize for myself: > > > > wh=where(d gt ((m=median(d,3))) and smooth((d eq m)*(n-2),n-2) eq n-3) > > Slow down there whipper-snapper! Gotta let these things stew for a > while. Though multiline, this is my best effort: > > dd = d(1:*)-d > nh = (n-1)/2 > wh = where(convol((dd GT 0) AND (dd(nh:*) LT 0), bytarr(nh)+1, nh) EQ 1, ct)+1 > > For the goobledy-gook impaired (aka DF :-), > dd is the first difference of the data > nh is the half-width of the peak > (dd GT 0) AND (dd(nh:*) LT 0) locates up-going followed by down-going points > convol(...) locates runs of length nh > > This one does exactly what was requested, which I'm not sure of about > your solution, J.D. On the other hand, your solution may be more > physically meaningful since it involves smoothing. > Nice entry Craig. But unfortunatly it doesn't *alway* do exactly what was requested. It works fine for n=5, but for n>5 (7,9,...), the index is off. Part of the reason is in the way convol works. For n=5, nh=2, and convol subscripts are left of center (t+i-1 for i=0,1). For n=7, nh=3, and convol subscripts are (t+i-1 for i=0,1,2), which is centered. For n=9, nh=4, and you again have (t+i-2 for i=0,1,2,3), left of center again... and so on. The additional complication is that you are finding the center of the *wrapped* up-going and down-going entries... the center of the peak is offset from this by (nh+1)/2 (which happens to equal 1 when n=5 or n=3). Adding this rather than 1 to the returned indices would make it correct. Here's an example for n=9 (^=up-goin, v=down-going): | | ^^^^vvvv | | You want the second marked location, the first down-going point. Convol will instead return the first marked location, so you have to offset by the half-width with the convol centering taken into account. Offsetting by 1 as you have done delivers the point between the two marked. Now I'll explain my solution, which does indeed produce indentical results to yours (after being fixed as above, and except for the null case -- yours always returns (nh-1)/2, instead of -1, though you could test the count to avoid this). It only works for n>=5 (but for n=3 you can use the first half of it as posted originally -- I left out the test on n which would select between them). Let's examine the parts: d gt ((m=median(d,3))) This is the solution to the original n=3 case, modified to assign m to the median 3 result. "d" is the data in which you are finding peaks, and n is the odd peak width you are trying to find. smooth((d eq m)*(n-2),n-2) eq n-3 This looks like it must be smoothing the data somehow. It is not. Consider first the term (d eq m). Just as (d gt m) indicates a local 3-point maximum, (d eq m) indicates a local 3 point line. I.e., the point has one neighbor greater and one neighbor less than it. It lives on a "slope", not on a peak (or valley). So, what is the definition of an n point peak? It has a central point which is a 3-pt peak, surrounded on either side by n-3 points which are 3-pt slopes. Here is a picture demonstrating this for an n=7 peak: * ^ * * / \ * * / \ * * The first is the data, the second indicates the kind of data, peak or slope. So, now we can consider the full smooth() expression. Smooth(), as you know, takes rolling averages. But averages are really just sums. We would like a way to find all 3-pt peaks, with n-3 3-pt "slopes" surrounding it. We can do a moving count of such slopes with smooth! (This is a trick I use all the time... boxcar counting.) To avoid integer truncation of the average, we first multiply the boolean vector of slopes by (n-2). E.g. if you had 11011 in the above n=7 example of (d eq m), you'd smooth 55055 instead, with width 5, to find the value 4 at the central position. This is just the number we wanted! So, putting it all together, we want a single 3-pt peak, with n-3 3-pt slopes around it. We've seen how to find peaks and slopes, and we can count slopes with smooth. So we have all the ingredients. "Aha," you might say,"but I have found the critical flaw in your reasoning... (d eq m) detects not a single kind of slope, but two kinds -- both / and \. To ensure a peak, you need only /'s before and only \'s after the peak! Any other combination is not an n point peak!" A seemingly bullet-proof argument. But think carefully... If we know a point is a 3 point peak, an adjacent slope to the left is necessarily a /, and an adjacent slope to the right is necessarily a \, since part of the slope includes the peak itself! Moving two away from the peak, any slope adjacent to a \ slope is itself a \ slope, by a similar argument, and any slope adjacent to a / is itself a / slope. You can't switch slopes without going through a valley or a peak! So this test is sufficient to find them all. To try this on some data, just generate some random numbers: IDL> d=randomu(sd,1000) IDL> n=7 IDL> wh=where(d gt ((m=median(d,3))) and smooth((d eq m)*(n-2),n-2) eq n-3) To find all 7 point peaks in a stream of random numbers. There will be a few. You can examine them either one at a time, or by constructing a flag array to print side by side. Here's an example of the data for one peak: IDL> print,wh 38 344 412 706 906 IDL> print,rotate(d[wh[0]-n/2:wh[0]+n/2],1) 0.271930 0.404679 0.460323 0.787722 0.432555 0.411609 0.0525138 You can see the central peak at 0.787722. The numbers of such peaks in random data decreases rapidly with n... try finding a 13 point peak... you'll need 10 million points before you have a good chance of getting some! As far as real applications, you would probably not use this for spectral lines, which often don't follow the strict rules of monotonically decreasing from a central value. Instead you'd convolve with a gaussian kernel of appropriate width, or do something similar. But I'm sure there are instances when finding "real" peaks is necessary. JD -- J.D. Smith |*| WORK: (607) 255-5842 Cornell University Dept. of Astronomy |*| (607) 255-6263 304 Space Sciences Bldg. |*| FAX: (607) 255-5875 Ithaca, NY 14853 |*|

**Follow-Ups**:**Re: Avoiding a for cicle***From:*Craig Markwardt

**References**:**Avoiding a for cicle***From:*Ricardo Fonseca

**Re: Avoiding a for cicle***From:*J.D. Smith

**Re: Avoiding a for cicle***From:*Benno Puetz

**Re: Avoiding a for cicle***From:*Craig Markwardt

**Re: Avoiding a for cicle***From:*J.D. Smith

**Re: Avoiding a for cicle***From:*J.D. Smith

**Re: Avoiding a for cicle***From:*Craig Markwardt

- Prev by Date:
**Re: Avoiding a for cicle** - Next by Date:
**Re: Avoiding a for cicle** - Prev by thread:
**Re: Avoiding a for cicle** - Next by thread:
**Re: Avoiding a for cicle** - Index(es):