DISCLAIMER. English is used here only for compatibility (ASCII only), so any suggestions about my bad grammar (and anything else) will be greatly appreciated.

Thursday, October 7, 2010

[bash][part][draft]Return values from bash function

UPD. 2010.09.07
If we need to return several values from a function, we can return them
    - either by setting some global variables,
    - or through a pipe ("stdout"),
    - or by using both.
      
1. Global variables.

    If we do not want to hardcode the names of these global variables into the
    script, we can pass their names as arguments and then use `eval` to assign
    the values in the child function,

        f() {
            # ...
            eval "$1=\"a    b\""
        }
        f g1
        declare -p g1

    but in this case name conflicts are possible between the function's local
    variables and global ones. If a local variable has the same name as a
    global one, it shadows the global, and we are no longer able to set the
    global variable from this child function.
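
    The conflict described above can be reproduced with a minimal sketch. The
    local variable `g1` inside `f()` is an assumption added for illustration;
    it stands for any local that happens to share the name passed in `$1`.

```shell
#!/bin/bash

f() {
    local g1='local value'    # hypothetical local that shadows the global g1
    eval "$1=\"a    b\""      # assigns to the local g1, not to the global one
}

g1='unchanged'
f g1
declare -p g1                 # the global still holds its old value
```

    The assignment inside `eval` succeeds, but it lands in the local variable,
    which disappears when `f()` returns.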
    
2. Pipe.

    If we use a pipe, we first have to gather all values from the child
    function's local variables and send them into the pipe, and then, in the
    caller, parse the pipe content to separate these values again and assign
    them to the proper variables. This is wasteful. Also, the order in which
    results are written into the pipe must be fixed, since splitting the
    result back into separate values and assigning them to the proper
    variables in the caller assumes a certain order. If this order ever
    changes accidentally, the returned values will be mixed up.

        f3() {
            local l1='a   b'
            local l2='   c d'
            echo "\"$l1\""
            echo "\"$l2\""
            return 0
        }
        eval "$(echo "$(f3)" | sed -e'1s/^/g1=/;2s/^/g2=/')"
        declare -p g1 g2
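
    To see why the fixed order matters, here is a small sketch in which the
    child function (a hypothetical variant named `f3_swapped`, not part of the
    original) emits the same two values in the opposite order, while the
    caller keeps parsing by line number.

```shell
#!/bin/bash

f3_swapped() {
    local l1='a   b'
    local l2='   c d'
    echo "\"$l2\""    # emitted first now
    echo "\"$l1\""
}

# The caller still assumes: line 1 is g1, line 2 is g2.
eval "$(f3_swapped | sed -e '1s/^/g1=/;2s/^/g2=/')"
declare -p g1 g2      # g1 and g2 end up with each other's values
```

    Nothing fails and no error is reported; the values are simply assigned to
    the wrong variables.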

3. Mixed.

    But we can use both techniques at the same time to eliminate all the
    problems listed above.

    The names of the global variables to be set are passed as arguments, but
    into the pipe we send not raw values but assignment statements with
    already expanded values, and the caller simply executes these assignment
    statements with `eval`.  In this case we get:
        - no name conflicts, since we no longer need the global variables in
          the child function: we need only their names to generate proper
          output, while the values are assigned to them by `eval` in the
          caller.
        - no parsing of the pipe by the caller, since the child function knows
          the names of the variables to which the caller wants to assign
          values, and sends output with already prepared assignment
          statements (values in these assignment statements must be expanded
          in the child function, since the child's local variables are
          destroyed when it returns).
        - the order of variables in the child function's output may be
          arbitrary (except in very special cases, where one variable uses the
          value of a previous one), since all assignments are already written
          and no parsing is required.

        f() {
            local l1='local   1'
            local l2='local 2'
            # Emit assignment statements for the names passed as arguments.
            echo "$1=\"$l1\""
            echo "$2=\"$l2\""
            return 0
        }
        declare g1=''
        declare g2=''
        eval "$(f g1 g2)"
        declare -p g1 g2
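
    Wrapping the value in double quotes, as above, works as long as the value
    itself contains no double quote, `$`, backtick or backslash. A possible
    hardening, offered here as an assumption rather than as part of the
    original technique, is to let bash's `printf '%q'` produce the quoting.
    The function name `f_quoted` and the sample values are made up for
    illustration.

```shell
#!/bin/bash

f_quoted() {
    local l1='say "hi"   $HOME'
    local l2='back\slash  and `ticks`'
    # %q quotes each value so that eval reads it back unchanged.
    printf '%s=%q\n' "$1" "$l1"
    printf '%s=%q\n' "$2" "$l2"
}

declare g1='' g2=''
eval "$(f_quoted g1 g2)"
declare -p g1 g2
```

    The caller side does not change: it still runs `eval` on whatever the
    child function wrote into the pipe.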

Note1:
    - when a function is called through command substitution, it is executed
      in a subshell (unlike a normal call, which is executed in the main
      shell), so you can _not_ set any of the parent shell's variables inside
      it. Thus, every parent shell variable that should be set must be
      written into the pipe in the form of an assignment statement.
    - you can not reinitialize an array in the parent shell using the
      'arr=( ${arr[@]} )' syntax when writing to the pipe, if array elements
      contain shell 'metacharacters' (not IFS characters!). This is because
      in the `eval`-ed script all values are already expanded (before they
      are written into the pipe) and do not contain any quotes, so they are
      broken into tokens (words and operators) by shell 'metacharacters' long
      before any expansions (including word splitting by 'IFS' characters)
      take place.
    - the only way to reinitialize an array in the parent shell is to write
      either 'arr[index]=value' (with the value quoted) for each element or
      the output of `declare -p arr` into the pipe.
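
The 'arr[index]=value' form from the last item can be sketched as follows. The
function name `f_arr`, the sample elements and the use of `printf '%q'` for
quoting are assumptions made for illustration, not part of the original
scripts.

```shell
#!/bin/bash

f_arr() {
    # Hypothetical sample data; elements contain spaces and operators.
    local -a out=( 'a   b' 'c & d' '(e) | f' )
    local i
    for i in "${!out[@]}"; do
        # One "name[index]=value" line per element, value quoted by %q.
        printf '%s[%d]=%q\n' "$1" "$i" "${out[i]}"
    done
}

declare -a arr=()
eval "$(f_arr arr)"
declare -p arr
```

One difference between the two forms: the output of `declare -p arr` starts
with `declare`, so if the caller runs `eval` on it inside a function, the
array becomes local to that function, while plain 'arr[index]=value' lines
assign to whatever `arr` is visible in the caller.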

Here is an example.

# ./t.sh {{{

#!/bin/bash

f() {
    echo "SUBSH=$BASH_SUBSHELL" >/dev/tty
    arr1[3]='    g h  '
    arr2[3]='i    k  l'
    IFS=" "
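    # Broken on purpose: the elements are expanded here, unquoted, so `eval`
    # in the parent re-splits them on spaces.
    echo "arr1=( ${arr1[*]} )"
    # Works: `declare -p` prints a properly quoted assignment.
    declare -p arr2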
    return 0
}
declare -a arr1=(
    'a   b'
    'c      '
)
declare -a arr2=(
    '    d'
    '   e  f   '
)
eval "$(f)"
declare -p arr1 arr2

# }}}

Note2:

Even if we don't need any results from a function that writes its result into
a pipe, the following considerations apply:
    - such a function should always be invoked in a subshell, even if we
      don't need its results. Otherwise, it is executed in the same shell as
      the parent function, and its manipulations with FDs may leave the
      parent's FD table in an unexpected state.
    - FD=1 of such a function should always be connected to something like
      '/dev/null'. Otherwise, it can write its results somewhere you don't
      expect them.
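
Both points can be checked with a compact sketch before going through the full
examples. The functions `child` and `fd9_state` are hypothetical helpers
introduced only for this illustration; FD 9 is used in the same role as in the
examples that follow.

```shell
#!/bin/bash

# Hypothetical child that, like f2() in the examples, changes the FD table.
child() {
    exec 9>&-              # closes FD 9 in whatever shell runs it
    echo "child result"
}

# Report whether FD 9 is currently open in this shell.
fd9_state() {
    if { true >&9; } 2>/dev/null; then echo open; else echo closed; fi
}

exec 9>&1                  # parent keeps a copy of stdout at FD 9

( child >/dev/null )       # subshell: the close happens in a copy
after_subshell="$(fd9_state)"

child >/dev/null           # same shell: the close hits the parent
after_direct="$(fd9_state)"

echo "after subshell call: $after_subshell, after direct call: $after_direct"
```

Redirecting to '/dev/null' alone is not enough: the second call discards the
output just as well, but still changes the caller's FD table.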

Example-1.

Here both functions f() and f2() return their results through a pipe, but
function f() does not need what f2() returns, so f2() is called (before f()
moves the pipe) without a subshell and without its own pipe connected. This
results in the following problems:
    - f2() closes 'fd_t_stdout' in the parent function's FD table, though f()
      thinks it is still open. This results in a "9: Bad file descriptor"
      error when f() tries to move the pipe, and f() loses the correct value
      for stdout. Hence, f() writes into the pipe both the output intended
      for 'stdout' and the output intended for the 'pipe'.
    - f2() writes its output intended for the 'pipe' into the _same_ pipe
      that f() uses, because it was invoked before f() moved its own pipe.
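
The "move pipe" and "restore pipe" steps that f() and f2() perform can be
looked at in isolation. This is a minimal sketch that reuses the FD numbers 9
and 7 from the scripts but leaves out logging and `lsof`; the anonymous
command substitution stands in for a function called as "$(f)".

```shell
#!/bin/bash

exec 9>&1                   # keep the real stdout at FD 9

v="$(
    exec 7>&1 1>&9-         # FD 7 = pipe; FD 9 is moved to FD 1 and closed
    echo "to stdout"        # bypasses the command substitution
    exec 1>&7-              # FD 7 is moved back to FD 1 and closed
    echo "to pipe"          # captured into v
)"

exec 9>&-
echo "v: '$v'"
```

The trailing '-' in '1>&9-' is what closes the source FD after duplication. In
the scripts f2() uses it when moving the pipe and f() does not, which is why
f2() leaves FD 9 closed behind it.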

# cat ./t.sh {{{
#!/bin/bash

declare -r -i fd_t_stdout=9
declare -r -i fd_t_pipe=7
declare -r log_file='./2.tmp'
declare -r read_file='/dev/zero'
[ -f "$log_file" ] && rm -f "$log_file"
exec 2>>"$log_file" <"$read_file"

f2() {
    echo "f2(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): move pipe" >>"$log_file"
    eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout-"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): write to stdout"

    echo "f2(): restore pipe" >>"$log_file"
    exec 1>&$fd_t_pipe-
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): write to pipe"
}
f() {
    echo "f(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    f2
    echo "f(): move pipe" >>"$log_file"
    eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f(): write to stdout"

    echo "f(): restore pipe" >>"$log_file"
    eval "exec 1>&$fd_t_pipe-"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f(): write to pipe"
}
eval "exec $fd_t_stdout>&1"
v="$(f)"
echo "v: '$v'"
eval "exec $fd_t_stdout>&-"

# }}}
# ./t.sh >|./1.tmp {{{
f(): start: $$: 12681, BASH=12683, SUBSH=1
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12683 root    0r   CHR    1,5          920 /dev/zero
t.sh    12683 root    1w  FIFO    0,6        31967 pipe
t.sh    12683 root    2w   REG    8,7   43 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12683 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): start: $$: 12681, BASH=12683, SUBSH=1
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12683 root    0r   CHR    1,5          920 /dev/zero
t.sh    12683 root    1w  FIFO    0,6        31967 pipe
t.sh    12683 root    2w   REG    8,7  438 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12683 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): move pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12683 root    0r   CHR    1,5          920 /dev/zero
t.sh    12683 root    1w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12683 root    2w   REG    8,7  805 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12683 root    7w  FIFO    0,6        31967 pipe
f2(): restore pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12683 root    0r   CHR    1,5          920 /dev/zero
t.sh    12683 root    1w  FIFO    0,6        31967 pipe
t.sh    12683 root    2w   REG    8,7 1175 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
f(): move pipe
./t.sh: line 36: 9: Bad file descriptor
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12683 root    0r   CHR    1,5          920 /dev/zero
t.sh    12683 root    1w  FIFO    0,6        31967 pipe
t.sh    12683 root    2w   REG    8,7 1492 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
f(): restore pipe
./t.sh: line 43: 7: Bad file descriptor
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12683 root    0r   CHR    1,5          920 /dev/zero
t.sh    12683 root    1w  FIFO    0,6        31967 pipe
t.sh    12683 root    2w   REG    8,7 1812 1590776 /home/sgf/new_tree/src/send_sms/2.tmp

# }}}
# cat ./1.tmp {{{
f2(): write to stdout
v: 'f2(): write to pipe
f(): write to stdout
f(): write to pipe'

# }}}

Example-2.

If f() moves its pipe before calling f2(), but still calls it without a
subshell, the result is no better:
    - f2() writes its output intended for the 'pipe' into 'stdout', since
      stdout, not the pipe, is open at FD=1 when f2() starts.
    - f2() overwrites FD='fd_t_pipe' in the parent function's FD table, so
      the pipe saved by f() is replaced with 'stdout' (since 'stdout' is at
      FD=1 in f2()). So f() loses the correct value for its pipe and writes
      into 'stdout' both the output intended for the pipe and the output
      intended for 'stdout'.

# cat ./t.sh {{{
#!/bin/bash

declare -r -i fd_t_stdout=9
declare -r -i fd_t_pipe=7
declare -r log_file='./2.tmp'
declare -r read_file='/dev/zero'
[ -f "$log_file" ] && rm -f "$log_file"
exec 2>>"$log_file" <"$read_file"

f2() {
    echo "f2(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): move pipe" >>"$log_file"
    eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout-"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): write to stdout"

    echo "f2(): restore pipe" >>"$log_file"
    exec 1>&$fd_t_pipe-
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): write to pipe"
}
f() {
    echo "f(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f(): move pipe" >>"$log_file"
    eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    f2
    echo "f(): write to stdout"

    echo "f(): restore pipe" >>"$log_file"
    eval "exec 1>&$fd_t_pipe-"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f(): write to pipe"
}
eval "exec $fd_t_stdout>&1"
v="$(f)"
echo "v: '$v'"
eval "exec $fd_t_stdout>&-"

# }}}
# ./t.sh >|./1.tmp {{{
f(): start: $$: 12713, BASH=12715, SUBSH=1
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12715 root    0r   CHR    1,5          920 /dev/zero
t.sh    12715 root    1w  FIFO    0,6        32160 pipe
t.sh    12715 root    2w   REG    8,7   43 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12715 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f(): move pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12715 root    0r   CHR    1,5          920 /dev/zero
t.sh    12715 root    1w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12715 root    2w   REG    8,7  409 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12715 root    7w  FIFO    0,6        32160 pipe
t.sh    12715 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): start: $$: 12713, BASH=12715, SUBSH=1
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12715 root    0r   CHR    1,5          920 /dev/zero
t.sh    12715 root    1w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12715 root    2w   REG    8,7  893 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12715 root    7w  FIFO    0,6        32160 pipe
t.sh    12715 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): move pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12715 root    0r   CHR    1,5          920 /dev/zero
t.sh    12715 root    1w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12715 root    2w   REG    8,7 1349 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12715 root    7w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): restore pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12715 root    0r   CHR    1,5          920 /dev/zero
t.sh    12715 root    1w   REG    8,7   22 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12715 root    2w   REG    8,7 1752 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
f(): restore pipe
./t.sh: line 43: 7: Bad file descriptor
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12715 root    0r   CHR    1,5          920 /dev/zero
t.sh    12715 root    1w   REG    8,7   63 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12715 root    2w   REG    8,7 2105 1590776 /home/sgf/new_tree/src/send_sms/2.tmp

# }}}
# cat ./1.tmp {{{
f2(): write to stdout
f2(): write to pipe
f(): write to stdout
f(): write to pipe
v: ''

# }}}

Example-3.

And here is how this can be done correctly. In short, we should call f2() like

    ( f2 >/dev/null )

In this case, the result of f2() is discarded (but we do not need it, right?),
and f() outputs its result as expected.

# cat ./t.sh {{{
#!/bin/bash

declare -r -i fd_t_stdout=9
declare -r -i fd_t_pipe=7
declare -r log_file='./2.tmp'
declare -r read_file='/dev/zero'
[ -f "$log_file" ] && rm -f "$log_file"
exec 2>>"$log_file" <"$read_file"

f2() {
    echo "f2(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): move pipe" >>"$log_file"
    eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout-"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): write to stdout"

    echo "f2(): restore pipe" >>"$log_file"
    exec 1>&$fd_t_pipe-
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f2(): write to pipe"
}
f() {
    echo "f(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f(): move pipe" >>"$log_file"
    eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    ( f2 >/dev/null )
    echo "f(): write to stdout"

    echo "f(): restore pipe" >>"$log_file"
    eval "exec 1>&$fd_t_pipe-"
    lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
    read -rsn1

    echo "f(): write to pipe"
}
eval "exec $fd_t_stdout>&1"
v="$(f)"
echo "v: '$v'"
eval "exec $fd_t_stdout>&-"

# }}}
# ./t.sh >|./1.tmp {{{
f(): start: $$: 12777, BASH=12779, SUBSH=1
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12779 root    0r   CHR    1,5          920 /dev/zero
t.sh    12779 root    1w  FIFO    0,6        32627 pipe
t.sh    12779 root    2w   REG    8,7   43 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12779 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f(): move pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12779 root    0r   CHR    1,5          920 /dev/zero
t.sh    12779 root    1w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12779 root    2w   REG    8,7  409 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12779 root    7w  FIFO    0,6        32627 pipe
t.sh    12779 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): start: $$: 12777, BASH=12784, SUBSH=2
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12784 root    0r   CHR    1,5          920 /dev/zero
t.sh    12784 root    1w   CHR    1,3          898 /dev/null
t.sh    12784 root    2w   REG    8,7  893 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12784 root    7w  FIFO    0,6        32627 pipe
t.sh    12784 root    9w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12784 root   10w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): move pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12784 root    0r   CHR    1,5          920 /dev/zero
t.sh    12784 root    1w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh    12784 root    2w   REG    8,7 1410 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12784 root    7w   CHR    1,3          898 /dev/null
t.sh    12784 root   10w   REG    8,7    0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): restore pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12784 root    0r   CHR    1,5          920 /dev/zero
t.sh    12784 root    1w   CHR    1,3          898 /dev/null
t.sh    12784 root    2w   REG    8,7 1874 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12784 root   10w   REG    8,7   22 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f(): restore pipe
COMMAND   PID USER   FD   TYPE DEVICE SIZE    NODE NAME
t.sh    12779 root    0r   CHR    1,5          920 /dev/zero
t.sh    12779 root    1w  FIFO    0,6        32627 pipe
t.sh    12779 root    2w   REG    8,7 2248 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh    12779 root    9w   REG    8,7   43 1121662 /home/sgf/new_tree/src/send_sms/1.tmp

# }}}
# cat ./1.tmp {{{
f2(): write to stdout
f(): write to stdout
v: 'f(): write to pipe'

# }}}