;+
; NAME:
; FLORMAT
;
; AUTHOR:
; Craig B. Markwardt, NASA/GSFC Code 662, Greenbelt, MD 20770
; Craig.Markwardt@nasa.gov
;
; PURPOSE:
; Format a string with named format variables
;
; CALLING SEQUENCE:
; RESULT = FLORMAT(FORMAT, [ struct ], [x=x, y=y, ...], [_EXTRA=struct])
;
; DESCRIPTION:
;
; The function FLORMAT is used to easily insert a set of named
; parameters into a string using simple format codes. The key point
; is that format strings use *named* parameters instead of the
; position in the string.
;
; FLORMAT makes it easy to make maintainable and understandable
; format codes. FLORMAT is a convenience routine, which will be most
; suitable for formatting tabular output, but can be used for any
; complicated string formatting job where the positional parameters
; of STRING() become hard to manage. Users of Python will recognize
; FLORMAT as implementing "string interpolation."
;
; The user passes a format string similar to the IDL printf-style
; format string (i.e. using modified "%" notation), and a set of
; named fields either by passing a structure, keywords, or both. The
; output strings are composed by inserting the named fields into the
; format string with any requested formatting.
;
; The function FLORMAT is equivalent to the STRING(...,FORMAT=fmt)
; method of formatting a string, where the format string is allowed
; to have the name of the variable.
;
; Let us consider an example of formatting a time with hours, minutes
; and seconds into a string as HH:MM:SS. One could use FLORMAT()
; like this,
;
; result = flormat('%(hour)02d:%(min)02d:%(sec)02d', $
; hour=hour, min=min, sec=sec)
;
; The variables HOUR, MIN and SEC are allowed to be scalars or
; vectors. The key point here is that the format string contains the
; *named* keyword variables (or structure entries). Unlike STRING(),
; the actual variables can be passed in any order, since the format
; string itself describes in what order the values will be assembled.
; This is similar to string interpolation in Python.
;
; The same variable can appear multiple times in the format string,
; but the user only needs to specify that variable once. For example,
;
; result = flormat('<a href=\"%(href)s\">Download %(href)s</a>', $
; href='filename.txt')
;
; Note that HREF appears twice in the format string.
;
; INPUT VARIABLES:
;
; FLORMAT() allows you to pass in the values as named keywords as
; shown above, where the keyword values are arrays, or by passing in
; an array of structures. A similar example to the one above is,
;
; S = replicate({hour: 0, min: 0, sec: 0}, 100)
; ; ... fill the structure S with 100 time values ...
; result = flormat('%(hour)02d:%(min)02d:%(sec)02d', s)
;
; In this case S is an array of structures, and the result will be an
; array of strings with the same number of elements as S.
;
; Compare this with standard IDL where a FOR-loop is required, no
; repetition is permitted, and it is difficult to see which format
; code corresponds to which variable. For example,
;
; for i = 0, n_elements(hour)-1 do begin
; result[i] = string(hour[i], min[i], sec[i], $
; format='(%"%02d:%02d:%02d")')
; endfor
;
; The input structure STRUCT may be an array of structures or a
; structure of arrays. It is also possible to pass *both* a structure
; STRUCT and keywords. The important thing is that each keyword
; and each STRUCT.FIELD must evaluate to the same number of
; elements. If they don't, then the smallest number of elements is
; used.
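;
; As a sketch of passing both at once, a structure array and a
; keyword vector of matching length can be combined in one call (the
; variable names S and SEC and the values are illustrative only),
;
; s = replicate({hour: 0, min: 0}, 3)
; s.hour = [1, 2, 3] & s.min = [5, 10, 15]
; sec = [0, 30, 59]
; result = flormat('%(hour)02d:%(min)02d:%(sec)02d', s, sec=sec)
;
; Here HOUR and MIN are looked up in S, and SEC in the keywords.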
;
; PRINTF-STYLE FORMAT CODES
;
; FLORMAT() accepts format codes in either C printf-style notation
; (the default), or a new "$" shell-style syntax if /SHELL_STYLE$ is
; set.
;
; By default, FLORMAT() assumes that C printf-style format codes
; are passed. FLORMAT() uses a slightly shortened notation for
; printf-style format codes, which saves some space and is more
; flexible.
;
; Standard printf-style format codes are of the form,
; FORMAT='(%"...format here...")' ;; Standard IDL
; The FLORMAT printf-style format codes simply dispense with the
; redundant parentheses and percent symbol,
; FORMAT='...format here...' ;; FLORMAT notation
; This notation improves the readability of the format string, since
; only the actual format string needs to be present. Also, this
; notation does not embed one set of quotation marks within another,
; as the standard IDL notation does, so format strings with quotation
; marks will be easier to compose.
;
; Standard IDL format codes look like this,
; %s - string
; %d - integer
; %04d - integer zero-padded to a width of 4, etc
;
; The new FLORMAT format strings look like this,
;
; %(name)s - string based on variable named NAME
; %(value)d - integer based on variable named VALUE
; %(index)04d - integer based on variable named INDEX,
; zero-padded to a width of 4
;
; As you can see, the only difference is the addition of the variable
; name in parentheses. These names are looked up in the input
; keywords and/or structure passed to FLORMAT().
;
; SHELL-STYLE FORMAT CODES
;
; Shell style "$" is a convenience notation when strict formatting is
; less important. Shell-style "$" format strings will be signaled by
; setting the SHELL_STYLE$ keyword. Note the trailing dollar-sign
; '$'. The format codes will look like this,
;
; $name - variable named NAME will be placed here
; $value - variable named VALUE will be placed here, etc.
;
; This is exactly how Unix shell string interpolation works.
; Variables are substituted into place using their "natural" format
; code, based on the variable type.
;
; result = flormat('<a href=\"$href\">Download $href</a>', /shell_style$, $
; href='filename.txt')
;
; Note that quotation marks still need to be escaped as \", just the
; same as calling STRING() or PRINT with a %-style format string.
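;
; As a further sketch, numeric values are inserted according to their
; type: integer types use %d, floating point types use %g, and
; strings use %s. The keyword names RUN and MEAN below are
; illustrative only,
;
; result = flormat('Run $run has mean $mean', /shell_style$, $
; run=42, mean=1.5)
;
; Here RUN is an integer and is formatted with %d, while MEAN is
; floating point and is formatted with %g.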
;
; CAVEATS:
;
; FLORMAT() is a convenience routine meant mostly to improve the
; readability and maintainability of format codes. FLORMAT() is not
; meant for high performance applications. It spends time parsing
; the input format string. It also spends memory building up a
; temporary output structure. However, for most applications such as
; constructing tables of up to thousands of entries, FLORMAT() should
; be perfectly adequate.
;
; The name "FLORMAT" is a play on the words "floor-mat" and "format."
; The "L" in FLORMAT can be thought of standing for "long-form" IDL
; format codes.
;
; PARAMETERS:
;
; FORMAT - scalar format string used to compose the output strings,
; in printf-style or shell-style notation as described above.
;
; STRUCT - input structure containing named entries. This should
; either be an array of structures, with each field
; containing a scalar; or, a structure where each field
; contains an array with the same number of elements.
;
; RETURNS:
;
; The resulting formatted strings. The return value will be an
; array of strings containing the same number of elements as passed
; as input.
;
; KEYWORD PARAMETERS:
;
; SHELL_STYLE$ - if set, then the format string is a shell-style
; string.
;
; FORMAT_AM_PM, FORMAT_DAYS_OF_WEEK, FORMAT_MONTHS - passed through
; to the AM_PM, DAYS_OF_WEEK and MONTHS keywords of STRING().
;
; All other named keywords are available to be used as named formats
; in your format code. Values may be either scalar or vector.
; Vector dimensions must match the dimensions of STRUCT (if
; STRUCT is passed).
;
; EXAMPLE:
;
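; A short table-formatting sketch using keyword vectors (the
; variable names NAME and FLUX and their values are illustrative
; only, not part of FLORMAT),
;
; name = ['Crab', 'Vela', 'Geminga']
; flux = [1.5, 0.75, 0.25]
; lines = flormat('%(name)s: %(flux)8.3f', name=name, flux=flux)
; print, lines, format='(A)'
;
; LINES should contain one string per element of NAME and FLUX.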
;
; ; Additional examples appear above.
;
; SEE ALSO:
;
; STRING, Format codes, C printf-style format codes
;
; MODIFICATION HISTORY:
; Written, CM, 14 Sep 2009
; Finalized and documented, CM, 08 Dec 2011
;
; $Id: flormat.pro,v 1.9 2013/03/16 23:29:40 cmarkwar Exp $
;
;-
; Copyright (C) 2011, Craig Markwardt
; This software is provided as is without any warranty whatsoever.
; Permission to use, copy, modify, and distribute modified or
; unmodified copies is granted, provided this copyright and disclaimer
; are included unchanged.
;-
pro flormat_structcheck, s, n, tn
COMPILE_OPT strictarr
tn = ''
if n_elements(s) EQ 0 then return
tp = size(s,/type)
if tp NE 8 then message, 'ERROR: input variable must be a structure'
if n_elements(n) EQ 0 then n = n_elements(s.(0))
tn = tag_names(s)
nt = n_elements(tn)
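;; Start at tag 0 so that the first field is also counted when N was
;; already set by an earlier call (e.g. STRUCT followed by keywords)
for i = 0, nt-1 do begin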
n = n < n_elements(s.(i))
endfor
return
end
function flormat, format0, s0, _EXTRA=extra, shell_style$=shell, $
format_am_pm=am_pm, format_days_of_week=days_of_week, $
format_months=months
COMPILE_OPT strictarr
if n_params() LT 1 AND n_elements(extra) EQ 0 then begin
USAGE:
message, 'USAGE: string = FLORMAT(FORMAT, struct) or', /info
message, ' string = FLORMAT(FORMAT, x=x, y=y, ...) or', /info
message, ' string = FLORMAT(FORMAT, _EXTRA=struct)', /info
return, ''
endif
;; FORMAT must be a scalar
tp = size(format0,/type)
if tp NE 7 OR n_elements(format0) GT 1 then begin
message, 'ERROR: FORMAT must be a scalar format string'
endif
fmt = format0[0]
if n_elements(s0) EQ 0 AND n_elements(extra) EQ 0 then begin
message, 'ERROR: you must either specify a structure or keywords'
endif
;; Do data-checking and also compute the total number of elements
flormat_structcheck, s0, n, tn0
flormat_structcheck, extra, n, tn1
;; Decide on whether it is a (%"") C-style or ($"") shell type format string
fmt_type = '%'
if keyword_set(shell) then fmt_type = '$'
;; Regular expression for %(varname) or $varname splitting
;; Example: "blah blah %(varname) blah blah"
;; splits to "blah blah " and " blah blah"
;; Example: "blah blah $varname blah blah"
;; splits to "blah blah " and " blah blah"
;;
regex = '%\([^)]*\)' ;; %(varname)
if fmt_type EQ '$' then begin
regex = '('+regex+'|\$[a-zA-Z_][a-zA-Z0-9_]*)' ;; or $varname
endif
spos = strsplit(fmt, regex, /regex, /preserve_null, length=slen)
ninterp = n_elements(spos)-1
;; No special format codes requested, so return immediately
if ninterp EQ 0 then return, fmt
;; Separate the format string into the "surrounding" string data (FMTS)
;; and the interpolation data (FMTI).
fmts = strmid(fmt, spos, slen)
ipos = spos+slen
ilen = spos[1:*] - ipos
ipos = ipos[0:ninterp-1]
case fmt_type of
'%': begin ;; %(NAME) -> NAME
ipos = ipos + 2
ilen = ilen - 3
end
'$': begin ;; $NAME -> NAME
ipos = ipos + 1
ilen = ilen - 1
end
endcase
fmti = strmid(fmt, ipos, ilen)
for i = 0, ninterp-1 do begin
varname = fmti[i]
;; Check structure
wh = where(strupcase(varname) EQ tn0, ct)
if ct GT 0 then begin
wh = wh[0]
;; Exemplar value of this field (used to determine its type)
exemplar = s0[0].(wh)
srci = 0L
endif else begin
;; Check EXTRA
wh = where(strupcase(varname) EQ tn1, ct)
if ct GT 0 then begin
wh = wh[0]
;; Exemplar value of this field (first element, used to determine its type)
exemplar = (extra.(wh))[0]
srci = 1L
endif else begin
message, 'ERROR: tag name "'+varname+'" does not exist in input structure'
endelse
endelse
tp = size(exemplar, /type)
dims = size(exemplar, /dimensions)
code = '' ;; Default: code is already in format str
;; If the user put $varname then we must decide on an output format
if fmt_type EQ '$' then begin
case tp of
1: code = 'd' ;; BYTE
2: code = 'd' ;; INT
3: code = 'd' ;; LONG
4: code = 'g' ;; FLOAT
5: code = 'g' ;; DOUBLE
7: code = 's' ;; STRING
12: code = 'd' ;; UINT
13: code = 'd' ;; ULONG
14: code = 'd' ;; LONG64
15: code = 'd' ;; ULONG64
else: message, string(varname, $
format='("ERROR: $",A0," must be of real, integer or string type")')
endcase
endif
;; New tag name
tni = string(i, format='("N",I0)')
if n_elements(news) EQ 0 then begin
news = create_struct(tni, exemplar)
imap = [wh]
isrc = [srci]
endif else begin
news = create_struct(news, tni, exemplar)
imap = [imap, wh]
isrc = [isrc, srci]
endelse
;; Add %-style format code to appropriate string
fmts[i] = fmts[i] + '%'+code
endfor
;; Transfer the data to the new output structure
outs = replicate(news, n)
for i = 0, n_elements(imap)-1 do begin
if isrc[i] EQ 0 then begin
outs.(i) = s0.(imap[i])
endif else begin
outs.(i) = extra.(imap[i])
endelse
endfor
ofmt = strjoin(fmts)
;; ;; Replace '"' by '\"' (poor man's REPSTR)
;; ofmts = strsplit(ofmt, '"', /preserve_null, /extract)
;; nquote = n_elements(ofmts)-1
;; if nquote GT 0 then begin
;; ofmts[0:nquote-1] = ofmts[0:nquote-1] + '\"'
;; ofmt = strjoin(ofmts)
;; endif
ofmt1 = '(%"'+ofmt+'")'
return, string(outs, format=ofmt1, $
am_pm=am_pm, days_of_week=days_of_week, months=months)
end