Here are two tasks:
1. Move all array elements to offset (index) 'off', i.e. so that the first
array element ends up at index 'off'.
2. Make holes (sequences of unset elements) in the array by moving
different parts of the array to different offsets (indexes) off1, off2,
off3, ...
This can't be done with
arr=( [i]="${arr[@]}" )
because the "${arr[@]}" expansion does not split into separate words here: after
an '[i]=' assignment prefix it is treated as the value of a single assignment
and works the same as "${arr[*]}".
So we need a workaround, like
#!/bin/bash
declare -a arr=(
1
2
3
4
5
)
declare -i off=15
declare -p arr
b=${arr[0]}
unset 'arr[0]'
arr=( [off]=$b "${arr[@]}" )
declare -p arr
This can be done more simply, but I'm not sure which version of bash this
requires
#!/bin/bash
declare -a arr=(
1
2
3
4
5
)
declare -i off=15
declare -p arr
arr=( [off]=${arr[0]} "${arr[@]:1}" )
declare -p arr
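To see the failure of the naive form for yourself, here is a small check. It is only a sketch, and the names 'src' and 'naive' are made up for illustration. Inside the '[3]=' element the whole expansion becomes one value, so the result has a single element instead of three.

```bash
#!/bin/bash
src=( 1 2 3 )
# Naive attempt: the expansion is the right-hand side of one assignment,
# so it is not split into separate array elements.
naive=( [3]="${src[@]}" )
declare -p naive
# Number of elements actually created.
echo "${#naive[@]}"
```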
The second task can be solved like this
#!/bin/bash
declare -a arr=(
1
2
3
4
5
6
7
8
9
10
11
12
13
)
declare -i off1=15 len1=2 i1=0
declare -i off2=21 len2=3 i2=$((i1 + len1))
declare -i off3=40 len3=4 i3=$((i2 + len2))
declare -i off4=70 i4=$((i3 + len3))
declare -p arr
arr=(
[off1]=${arr[i1]} "${arr[@]:i1 + 1: len1 - 1}"
[off2]=${arr[i2]} "${arr[@]:i2 + 1: len2 - 1}"
[off3]=${arr[i3]} "${arr[@]:i3 + 1: len3 - 1}"
[off4]=${arr[i4]} "${arr[@]:i4 + 1}"
)
declare -p arr
Note that any holes that already existed in the array are lost after such operations.
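If the existing holes must survive, one possible approach (not from the scripts above, just a sketch under the assumption that a plain loop is acceptable) is to walk the list of used indexes with "${!arr[@]}" and copy each element to its index plus the offset. The names 'shifted' and 'i' are hypothetical.

```bash
#!/bin/bash
arr=( [0]=a [1]=b [5]='c d' )   # index 2..4 is a hole
off=10
shifted=()
# Copy every existing element to (old index + off); unset indexes stay unset.
for i in "${!arr[@]}"; do
    shifted[i+off]=${arr[i]}
done
# Rebuild the original array from the shifted copy, again index by index.
arr=()
for i in "${!shifted[@]}"; do
    arr[i]=${shifted[i]}
done
declare -p arr
```

This is slower than the one-line slice form, since it assigns elements one at a time, but the distance between elements is kept.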
DISCLAIMER. English language used here only for compatibility (ASCII only), so any suggestions about my bad grammar (and not only it) will be greatly appreciated.
Thursday, December 16, 2010
[bash] Move array elements
[bash] Write filenames to array.
Here are two general tasks:
1. Assign strings (e.g. filenames) separated by '\0' from some input
stream to corresponding array elements.
2. Convert an array into a stream of strings separated by '\0'.
I.e. we have some bash script that gets such a stream somewhere, e.g. from `find`
find $root -wholename "*/$project" -prune -print0
and then we want to place these filenames into array elements. But there is a
problem: we can't use '\0' in IFS, so we can't split find's output stream
using bash word splitting. And we can't use any other character to
separate filenames, because a filename may contain any of them.
One possible method (I don't know of another, though there may be one) is to
transform the stream into bash code and then `eval` it.
But here is another problem: we can't simply wrap the string in double or
single quotes, because the string may itself contain unescaped double or single
quotes (find does not escape characters in filenames, so we assume
that the string contains unescaped characters, i.e. is written "as is"). For
example, take
a"' b.txt
After wrapping it in double quotes we get
"a"' b.txt"
and in single quotes
'a"' b.txt'
In both cases some part of the string remains unescaped and an unmatched quote
appears. (TODO: write another example with a command.)
So we can't escape the string with commands like sed or awk, which perform text
editing without parsing the string. But we can use bash itself to parse and
escape the string properly, and then output the escaped result, like this
eval "$(find "$root" -wholename "*/$project" -prune -print0 \
| sort -z -s \
| xargs -0 -x bash -c '
arr=( "$@" );
declare -p arr
' escape_filename)"
(The last argument 'escape_filename' is used as $0. It may be anything, but it
is required for this to work correctly. For details see chapter 7.4.2 of 'info find'.)
The script for the bash instance invoked by xargs may do some other operations
with the strings, like
j=15;
eval "$(find "$root" -wholename "*/$project" -prune -print0 \
| sort -z -s \
| xargs -0 -x bash -c "
set -- \"\${@#$root/}\"
set -- \"\${@%$project}\"
arr2=( [$j]=\"\$1\" \"\${@:2}\" );
declare -p arr2
" escape_filename)"
Here we delete the leading ($root) and trailing ($project) portions of each
filename and then assign the resulting set of strings to the array starting at
index $j. The variables $root, $project and $j are substituted by the main bash
process before the pipeline is executed.
If this script uses the same array name you want to assign to in the main
script, no further editing of the output is needed.
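Another method, which does not go through `eval` at all, is to read the stream with `read -d ''`, which takes '\0' as the record delimiter. The sketch below is an assumption-laden illustration, not part of the scripts above: 'produce' is a hypothetical stand-in for `find ... -print0`, and 'files' and 'copy' are made-up names. The last loop also covers the second task, turning an array back into a '\0'-separated stream with `printf '%s\0'`.

```bash
#!/bin/bash
# Stand-in for `find ... -print0`: three NUL-terminated names, one with
# quotes and a space, one with an embedded newline.
produce() {
    printf '%s\0' "a\"' b.txt" $'two\nlines' 'plain'
}

# Task 1: stream -> array. IFS= keeps leading/trailing blanks,
# -r keeps backslashes, -d '' makes NUL the delimiter.
files=()
while IFS= read -r -d '' name; do
    files+=( "$name" )
done < <(produce)
declare -p files

# Task 2: array -> stream, read back here only to show the round trip.
copy=()
while IFS= read -r -d '' name; do
    copy+=( "$name" )
done < <(printf '%s\0' "${files[@]}")
declare -p copy
```

Note that `printf '%s\0'` with an empty array still prints a single '\0', so an empty array needs a separate check. Newer bash versions (4.4 and later, as far as I know) also accept `mapfile -d ''` for the first task.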
Here is another example of what incorrect quoting may lead to. If the
input stream contains a string like this
a' rm -rf ~ '
and we wrap it in single quotes and add an assignment (with sed, for example)
var='a' rm -rf ~ ''
then, when it is eval-ed, it executes the command
rm -rf ~
Here is a sample script
#!/bin/bash
str="a' ls ~ '"
to_eval="$(echo "$str" | sed -e"s/^/var='/;s/$/'/")"
eval "$to_eval"
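For contrast, here is a sketch of the same assignment where bash itself does the quoting, through the `%q` format of the `printf` builtin. The variable names are the same as in the sample script; nothing gets executed, and 'var' receives the string unchanged.

```bash
#!/bin/bash
str="a' ls ~ '"
# %q prints the argument quoted so that the shell reads it back as one word.
to_eval="var=$(printf '%q' "$str")"
echo "$to_eval"
eval "$to_eval"
declare -p var
```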
Thursday, October 7, 2010
[bash][part][draft]Return values from bash function
UPD. 2010.09.07
If we need to return several values from a function, we can return them
- either by setting some global variables,
- or through a pipe ("stdout"),
- or by using both.
1. Global variables.
If we do not want to hardcode these global variable names into the script, we
can pass their names as arguments and then use `eval` to assign values in
the child function,
f() {
..
eval "$1=\"a b\""
}
f g1
declare -p g1
but in this case name conflicts are possible between the function's local
variables and global ones. If a local variable has the same name as a global
one, it shadows it and we are no longer able to set the global variable in this
child function.
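The conflict is easy to reproduce. In this sketch (the values are made up for illustration) the function has a local variable that happens to be called 'g1', so the `eval` sets the local one and the caller's 'g1' stays as it was.

```bash
#!/bin/bash
f() {
    local g1='local value'   # same name as the caller's variable
    eval "$1=\"a b\""        # assigns to the local g1, not the global one
}
g1='untouched'
f g1
declare -p g1
```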
2. Pipe.
If we use a pipe, we first gather all values from the child function's local
variables and send them into the pipe, and then, in the caller, we
parse the pipe content to separate these values again and assign them to the
proper variables. This is wasteful. Also, the order in which results are written
into the pipe must be fixed, since splitting the result into separate values
and assigning them to the proper variables in the caller assumes
a certain order. If this order ever changes by accident,
the returned values will be mixed up.
f3() {
local l1='a b'
local l2=' c d'
echo "\"$l1\""
echo "\"$l2\""
return 0
}
eval "$(echo "$(f3)" | sed -e'1s/^/g1=/;2s/^/g2=/')"
declare -p g1 g2
3. Mixed.
But we can use both techniques at the same time to eliminate all
the problems listed above.
The names of the global variables to be set are passed as arguments,
but into the pipe we send not raw values but assignment statements with
already expanded values, and the caller simply executes
them (the assignment statements) with `eval`. In this case we get:
- no name conflicts, since we no longer need the global variables in the
child function: we need only their names to generate proper output,
and the values are assigned to them by `eval` in the caller.
- no parsing of the pipe by the caller, since the child function knows the
names of the variables to which the caller wants to assign values, and sends
output with already prepared assignment statements (the values in these
assignment statements must be expanded in the child function, since
the child's local variables are deleted when it returns).
- The order of variables in the child function's output may be arbitrary
(except in very special cases, where one variable uses the value of a previous
one), since all assignments are already written and no parsing is
required.
f() {
local l1='local 1'
local l2='local 2'
echo "$1=\"$l1\""
echo "$2=\"$l2\""
return 0
}
declare g1=''
declare g2=''
eval "$(f g1 g2)"
declare -p g1 g2
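A variant of the same idea, as a sketch: the variable names are taken from $1 and $2, and the values are quoted with `printf '%q'` instead of being wrapped in double quotes by hand, so values containing quotes or '$' survive the `eval`. The sample values are made up to show that.

```bash
#!/bin/bash
f() {
    local l1='local "1" $HOME'
    local l2=' local 2 '
    # Emit "name=quoted-value" lines; %q quotes the value for the shell.
    printf '%s=%q\n' "$1" "$l1"
    printf '%s=%q\n' "$2" "$l2"
}
g1=''
g2=''
eval "$(f g1 g2)"
declare -p g1 g2
```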
Note1:
- when a function is called through command substitution, it is
executed in a subshell (unlike a normal call, which is
executed in the main shell), so you can _not_ set any of the parent shell's
variables inside it. Thus, every parent shell variable that should be
set must be written into the pipe in the form of an assignment statement.
- you can not reinitialize an array in the parent shell using the
'arr=( ${arr[@]} )' syntax when writing to the pipe, if array elements contain
shell 'metacharacters' (not IFS characters!). This is because in the `eval`-ed
script all values are already expanded (before being written
into the pipe) and do not contain any quotes, so they are broken into
tokens (words and operators) at shell 'metacharacters' long before any
expansions (including word splitting on 'IFS' characters) take
place.
- the only way to reinitialize an array in the parent shell is to write either
'arr[index]=value' for each element or the output of `declare -p arr` into the
pipe.
Here is an example.
# ./t.sh {{{
#!/bin/bash
f() {
echo "SUBSH=$BASH_SUBSHELL" >/dev/tty
arr1[3]=' g h '
arr2[3]='i k l'
IFS=" "
echo "arr1=( ${arr1[*]} )"
echo "$(declare -p arr2)"
return 0
}
declare -a arr1=(
'a b'
'c '
)
declare -a arr2=(
' d'
' e f '
)
eval "$(f)"
declare -p arr1 arr2
# }}}
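The 'arr[index]=value' form mentioned in Note1 can be generated in a loop. This is only a sketch with made-up names ('out', 'arr') and made-up values; each value is quoted with `printf '%q'`, so metacharacters like ';' and blanks at the edges come through the `eval` unchanged.

```bash
#!/bin/bash
f() {
    local -a out=( 'a b' 'c;d' ' e ' )
    local i
    # One assignment statement per element: arr[index]=quoted-value
    for i in "${!out[@]}"; do
        printf 'arr[%d]=%q\n' "$i" "${out[i]}"
    done
}
arr=()
eval "$(f)"
declare -p arr
```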
Note2:
Even if we don't need any results from a function that writes its result into a
pipe, the following considerations apply:
- such a function should always be invoked in a subshell, even if we don't
need its results. Otherwise, it is executed in the same shell as the
parent function and its actions with FDs may leave the parent's FD
table in an unexpected state.
- FD=1 of such a function should always be connected to something like
'/dev/null'. Otherwise, it can write results somewhere you don't expect
them.
Example-1.
Here both functions f() and f2() return their result through a pipe, but f()
does not need what f2() returns, so f2() is called (before f() moves its pipe)
without a subshell and without a pipe connected. This results in the following
problems:
- f2() closes 'fd_t_stdout' in the parent function's FD table, though f()
thinks it's still open. This results in a "9: Bad file descriptor" error
when f() tries to move its pipe, and f() loses the correct value for stdout.
Hence, f() writes into the pipe both the output intended for 'stdout' and
the output intended for the 'pipe'.
- f2() writes its output intended for the 'pipe' into the _same_ pipe that f()
uses, because it was invoked before f() moved its own pipe.
# cat ./t.sh {{{
#!/bin/bash
declare -r -i fd_t_stdout=9
declare -r -i fd_t_pipe=7
declare -r log_file='./2.tmp'
declare -r read_file='/dev/zero'
[ -f "$log_file" ] && rm -f "$log_file"
exec 2>>"$log_file" <"$read_file"
f2() {
echo "f2(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): move pipe" >>"$log_file"
eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout-"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): write to stdout"
echo "f2(): restore pipe" >>"$log_file"
exec 1>&$fd_t_pipe-
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): write to pipe"
}
f() {
echo "f(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
f2
echo "f(): move pipe" >>"$log_file"
eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f(): write to stdout"
echo "f(): restore pipe" >>"$log_file"
eval "exec 1>&$fd_t_pipe-"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f(): write to pipe"
}
eval "exec $fd_t_stdout>&1"
v="$(f)"
echo "v: '$v'"
eval "exec $fd_t_stdout>&-"
# }}}
# ./t.sh >|./1.tmp {{{
f(): start: $$: 12681, BASH=12683, SUBSH=1
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12683 root 0r CHR 1,5 920 /dev/zero
t.sh 12683 root 1w FIFO 0,6 31967 pipe
t.sh 12683 root 2w REG 8,7 43 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12683 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): start: $$: 12681, BASH=12683, SUBSH=1
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12683 root 0r CHR 1,5 920 /dev/zero
t.sh 12683 root 1w FIFO 0,6 31967 pipe
t.sh 12683 root 2w REG 8,7 438 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12683 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): move pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12683 root 0r CHR 1,5 920 /dev/zero
t.sh 12683 root 1w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12683 root 2w REG 8,7 805 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12683 root 7w FIFO 0,6 31967 pipe
f2(): restore pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12683 root 0r CHR 1,5 920 /dev/zero
t.sh 12683 root 1w FIFO 0,6 31967 pipe
t.sh 12683 root 2w REG 8,7 1175 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
f(): move pipe
./t.sh: line 36: 9: Bad file descriptor
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12683 root 0r CHR 1,5 920 /dev/zero
t.sh 12683 root 1w FIFO 0,6 31967 pipe
t.sh 12683 root 2w REG 8,7 1492 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
f(): restore pipe
./t.sh: line 43: 7: Bad file descriptor
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12683 root 0r CHR 1,5 920 /dev/zero
t.sh 12683 root 1w FIFO 0,6 31967 pipe
t.sh 12683 root 2w REG 8,7 1812 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
# }}}
# cat ./1.tmp {{{
f2(): write to stdout
v: 'f2(): write to pipe
f(): write to stdout
f(): write to pipe'
# }}}
Example-2.
If f() moves its pipe before calling f2(), but still calls it without a
subshell, things are no better:
- f2() writes its output intended for the 'pipe' into 'stdout', since
stdout instead of the pipe is open at FD=1 when f2() starts.
- f2() replaces the FD='fd_t_pipe' entry in the parent function's FD table, so
f()'s saved pipe is replaced with 'stdout' (since 'stdout' is
at FD=1 in f2()). So f() loses the correct value for its pipe and writes
into 'stdout' both the output intended for the pipe and the output intended
for 'stdout'.
# cat ./t.sh {{{
#!/bin/bash
declare -r -i fd_t_stdout=9
declare -r -i fd_t_pipe=7
declare -r log_file='./2.tmp'
declare -r read_file='/dev/zero'
[ -f "$log_file" ] && rm -f "$log_file"
exec 2>>"$log_file" <"$read_file"
f2() {
echo "f2(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): move pipe" >>"$log_file"
eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout-"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): write to stdout"
echo "f2(): restore pipe" >>"$log_file"
exec 1>&$fd_t_pipe-
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): write to pipe"
}
f() {
echo "f(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f(): move pipe" >>"$log_file"
eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
f2
echo "f(): write to stdout"
echo "f(): restore pipe" >>"$log_file"
eval "exec 1>&$fd_t_pipe-"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f(): write to pipe"
}
eval "exec $fd_t_stdout>&1"
v="$(f)"
echo "v: '$v'"
eval "exec $fd_t_stdout>&-"
# }}}
# ./t.sh >|./1.tmp {{{
f(): start: $$: 12713, BASH=12715, SUBSH=1
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12715 root 0r CHR 1,5 920 /dev/zero
t.sh 12715 root 1w FIFO 0,6 32160 pipe
t.sh 12715 root 2w REG 8,7 43 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12715 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f(): move pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12715 root 0r CHR 1,5 920 /dev/zero
t.sh 12715 root 1w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12715 root 2w REG 8,7 409 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12715 root 7w FIFO 0,6 32160 pipe
t.sh 12715 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): start: $$: 12713, BASH=12715, SUBSH=1
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12715 root 0r CHR 1,5 920 /dev/zero
t.sh 12715 root 1w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12715 root 2w REG 8,7 893 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12715 root 7w FIFO 0,6 32160 pipe
t.sh 12715 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): move pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12715 root 0r CHR 1,5 920 /dev/zero
t.sh 12715 root 1w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12715 root 2w REG 8,7 1349 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12715 root 7w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): restore pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12715 root 0r CHR 1,5 920 /dev/zero
t.sh 12715 root 1w REG 8,7 22 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12715 root 2w REG 8,7 1752 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
f(): restore pipe
./t.sh: line 43: 7: Bad file descriptor
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12715 root 0r CHR 1,5 920 /dev/zero
t.sh 12715 root 1w REG 8,7 63 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12715 root 2w REG 8,7 2105 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
# }}}
# cat ./1.tmp {{{
f2(): write to stdout
f2(): write to pipe
f(): write to stdout
f(): write to pipe
v: ''
# }}}
Example-3.
And here is how this can be done correctly. In short, we should call f2() like
( f2 >/dev/null )
In this case f2()'s result is discarded (but we do not need it, right?), and
f() outputs its result as expected.
# cat ./t.sh {{{
#!/bin/bash
declare -r -i fd_t_stdout=9
declare -r -i fd_t_pipe=7
declare -r log_file='./2.tmp'
declare -r read_file='/dev/zero'
[ -f "$log_file" ] && rm -f "$log_file"
exec 2>>"$log_file" <"$read_file"
f2() {
echo "f2(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): move pipe" >>"$log_file"
eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout-"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): write to stdout"
echo "f2(): restore pipe" >>"$log_file"
exec 1>&$fd_t_pipe-
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f2(): write to pipe"
}
f() {
echo "f(): start: \$\$: $$, BASH=$BASHPID, SUBSH=$BASH_SUBSHELL" >>"$log_file"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f(): move pipe" >>"$log_file"
eval "exec $fd_t_pipe>&1 1>&$fd_t_stdout"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
( f2 >/dev/null )
echo "f(): write to stdout"
echo "f(): restore pipe" >>"$log_file"
eval "exec 1>&$fd_t_pipe-"
lsof -a -p $BASHPID -d'^mem,^txt,^rtd,^cwd' >>"$log_file"
read -rsn1
echo "f(): write to pipe"
}
eval "exec $fd_t_stdout>&1"
v="$(f)"
echo "v: '$v'"
eval "exec $fd_t_stdout>&-"
# }}}
# ./t.sh >|./1.tmp {{{
f(): start: $$: 12777, BASH=12779, SUBSH=1
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12779 root 0r CHR 1,5 920 /dev/zero
t.sh 12779 root 1w FIFO 0,6 32627 pipe
t.sh 12779 root 2w REG 8,7 43 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12779 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f(): move pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12779 root 0r CHR 1,5 920 /dev/zero
t.sh 12779 root 1w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12779 root 2w REG 8,7 409 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12779 root 7w FIFO 0,6 32627 pipe
t.sh 12779 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): start: $$: 12777, BASH=12784, SUBSH=2
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12784 root 0r CHR 1,5 920 /dev/zero
t.sh 12784 root 1w CHR 1,3 898 /dev/null
t.sh 12784 root 2w REG 8,7 893 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12784 root 7w FIFO 0,6 32627 pipe
t.sh 12784 root 9w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12784 root 10w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): move pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12784 root 0r CHR 1,5 920 /dev/zero
t.sh 12784 root 1w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
t.sh 12784 root 2w REG 8,7 1410 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12784 root 7w CHR 1,3 898 /dev/null
t.sh 12784 root 10w REG 8,7 0 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f2(): restore pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12784 root 0r CHR 1,5 920 /dev/zero
t.sh 12784 root 1w CHR 1,3 898 /dev/null
t.sh 12784 root 2w REG 8,7 1874 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12784 root 10w REG 8,7 22 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
f(): restore pipe
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 12779 root 0r CHR 1,5 920 /dev/zero
t.sh 12779 root 1w FIFO 0,6 32627 pipe
t.sh 12779 root 2w REG 8,7 2248 1590776 /home/sgf/new_tree/src/send_sms/2.tmp
t.sh 12779 root 9w REG 8,7 43 1121662 /home/sgf/new_tree/src/send_sms/1.tmp
# }}}
# cat ./1.tmp {{{
f2(): write to stdout
f(): write to stdout
v: 'f(): write to pipe'
# }}}
Thursday, September 30, 2010
[draft][part] Shell I/O redirection
1. The table of open file descriptors maps a number (file descriptor, FD) to
the actual file to which output will be written.
+----+------------+
| FD | File |
+----+------------+
| 0 | /dev/vc/1 |
+----+------------+
| 1 | /dev/vc/1 |
+----+------------+
| 2 | /dev/vc/1 |
+----+------------+
| ... |
2. Shell redirection operators change only the 'File' column of the
table. So, if a program writes to FD=5 (a write(5,..) call) you can _not_
force it to write to another FD with shell redirection: you can only
change the content of the 'File' column in the FD=5 row (i.e. which file FD=5
is mapped to) and thereby change where the output finally arrives (in
which file).
3. '/dev/stdout' and '/dev/stderr' are _not_ actual files, they're _links_
to whatever is _currently_ open at FD=1 and FD=2 respectively, i.e. links to
the file specified in the 'File' column of the FD=1 and FD=2 rows.
# ls -l /dev/stdout /dev/stderr
lrwxrwxrwx 1 root root 15 Sep 30 07:56 /dev/stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Sep 30 07:56 /dev/stdout -> /proc/self/fd/1
4. In a construction like
f() {
echo "abc"
return 0
}
res=$(f)
the result of function f() is sent through a pipe to the caller. I think it is
unclear to say that the result is sent through "stdout", though it's
correct, since "stdout" is _not_ an actual file: it's a link to whatever is
currently open at file descriptor 1, and here that is a pipe. By default,
when a function is invoked through command substitution (the function is
executed in a subshell), the pipe is opened for writing at FD=1 in the child
process (function) and for reading at FD=3 in the caller process. So, if you
use something like
echo "abc"
"abc" is written into the pipe (because `echo` always writes to FD=1). But
if we don't want the way we return the result to affect the function's
code, we should move the 'pipe' from FD=1 to some unused FD and restore
the original FD=1 content.
exec 7>&1 1>&2
When the result is ready, we move the 'pipe' back to FD=1 and `echo` the result
into it.
exec 1>&7-
echo "result"
Note that the pipe is open in all child subshells as well.
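Before the complete example, here is a much smaller sketch of points 2 and 4. It is an illustration under simple assumptions: FD numbers 5 and 7 are free, and the names 'writer', 'first' and 'res' are made up. In the first part the function always writes to FD=5 and the caller only changes which file that row of the table points to. In the second part the pipe is parked at FD=7 while ordinary output goes to where FD=2 points, and only the final `echo` lands in the result.

```bash
#!/bin/bash
# Part 1: the function always writes to FD 5; the caller chooses the file.
writer() {
    echo "data" >&5
}
tmp=$(mktemp)
writer 5>"$tmp"
first=$(cat "$tmp")
rm -f "$tmp"

# Part 2: park the pipe at FD 7 while the function does ordinary output.
f() {
    exec 7>&1 1>&2            # pipe saved at FD 7; FD 1 now follows FD 2
    echo "progress message"   # goes where stderr goes, not into the result
    exec 1>&7-                # move the pipe back to FD 1, closing FD 7
    echo "result"
}
res=$(f 2>/dev/null)
declare -p first res
```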
Example. Illustrates a complete implementation, with several stacked functions
returning their result through a pipe.
#!/bin/bash
log='./t.log'
read_f='/dev/null'
rm -f $log
f2() {
echo "f2(): SUBSH=$BASH_SUBSHELL" >>$log
echo "f2(): BASHPID[$BASHPID]:" >>$log
lsof -a -p $BASHPID -d '^mem,^cwd,^rtd,^txt' >>$log
echo "f2(): stdout-1: ghi"
echo "f2(): stderr-1: klm" >/dev/stderr
echo "f2(): Before moving pipe somewhere" >>$log && read -n1 < $read_f
eval "exec $save_pipe>&1 1>&$save_stdout-"
echo "f2(): BASHPID[$BASHPID]:" >>$log
lsof -a -p $BASHPID -d '^mem,^cwd,^rtd,^txt' >>$log
echo "f2(): stdout-2: ghi"
echo "f2(): stderr-2: klm" >/dev/stderr
echo "f2(): Before exit f2()" >>$log && read -n1 < $read_f
return 0
}
f() {
echo "f(): SUBSH=$BASH_SUBSHELL" >>$log
echo "f(): \$\$[$$]:" >>$log
lsof -a -p $$ -d '^mem,^cwd,^rtd,^txt' >>$log
echo "f(): BASHPID[$BASHPID]:" >>$log
lsof -a -p $BASHPID -d '^mem,^cwd,^rtd,^txt' >>$log
echo 'f(): stdout-1: abc'
echo 'f(): stderr-1: def' >/dev/stderr
echo "f(): Before moving pipe somewhere" >>$log && read -n1 < $read_f
eval "exec $save_pipe>&1 1>&$save_stdout-"
echo "f(): BASHPID[$BASHPID]:" >>$log
lsof -a -p $BASHPID -d '^mem,^cwd,^rtd,^txt' >>$log
echo 'f(): stdout-2: abc'
echo 'f(): stderr-2: def' >/dev/stderr
echo "f(): Before calling f2()" >>$log && read -n1 < $read_f
eval "exec $save_stdout>&1"
v=$(f2)
echo "-$v-"
eval "exec $save_stdout>&-"
echo "f(): BASHPID[$BASHPID]:" >>$log
lsof -a -p $BASHPID -d '^mem,^cwd,^rtd,^txt' >>$log
echo "f(): Before restoring pipe" >>$log && read -n1 < $read_f
exec >&$save_pipe-
echo 'f(): stdout-3: abc'
echo 'f(): stderr-3: def' >/dev/stderr
echo "f(): BASHPID[$BASHPID]:" >>$log
lsof -a -p $BASHPID -d '^mem,^cwd,^rtd,^txt' >>$log
echo "f(): Before exit f()" >>$log && read -n1 < $read_f
return 0
}
declare -r -i save_stdout=9
declare -r -i save_pipe=7
eval "exec $save_stdout>&1"
v=$(f)
eval "exec $save_stdout>&-"
echo "in main()"
echo "\$\$[$$]:" >>$log
lsof -a -p $$ -d '^mem,^cwd,^rtd,^txt' >>$log
echo "-$v-"
Here is an illustration:
(subshell)
main() ....................
+--------------+ . f() .
| 3r pipe | call f() . +--------------+ .
| 9u "stdout" |----------->| 1w pipe(f) | .
+--------------+ . | 9u "stdout" | .
. +--------------+ .
. | .
. | move pipe(f)
. | .
. v . (subshell)
. +--------------+ . ....................
. | 1w "stdout" | . . f2() .
. | 7w pipe(f) | call f2() . +--------------+ .
. | 9- (closed) |------------>| 1w pipe(f2) | .
. +--------------+ . . | 7w pipe(f) | .
. . . | 9u "stdout" | .
. . . +--------------+ .
. . . | .
. . . | move pipe(f2)
. . . | over pipe(f)
. . . | .
. . . v .
. . . +--------------+ .
. +--------------+ . return . | 1w "stdout" | .
. | 1w "stdout" |<------------| 7w pipe(f2) | .
. | 7w pipe(f) | . . | 9- (closed) | .
. +--------------+ . . +--------------+ .
. | . ....................
. | restore pipe(f)
. | .
. v .
. +--------------+ .
+--------------+ return . | 1w pipe(f) | .
| 1u "stdout" |------------| 7- (closed) | .
+--------------+ . +--------------+ .
....................
And here is the log (slightly edited)
# ./t.sh >|./1.tmp
f(): SUBSH=1
f(): $$[4794]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4794 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4794 root 1w REG 8,7 0 1121662 .../1.tmp
t.sh 4794 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4794 root 3r FIFO 0,6 14132 pipe
t.sh 4794 root 9w REG 8,7 0 1121662 .../1.tmp
t.sh 4794 root 255r REG 8,7 2075 1121653 .../t.sh
f(): BASHPID[4796]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4796 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 1w FIFO 0,6 14132 pipe
t.sh 4796 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 9w REG 8,7 0 1121662 .../1.tmp
f(): Before moving pipe somewhere
f(): BASHPID[4796]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4796 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 1w REG 8,7 0 1121662 .../1.tmp
t.sh 4796 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 7w FIFO 0,6 14132 pipe
f(): Before calling f2()
f2(): SUBSH=2
f2(): BASHPID[4803]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4803 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4803 root 1w FIFO 0,6 14194 pipe
t.sh 4803 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4803 root 7w FIFO 0,6 14132 pipe
t.sh 4803 root 9w REG 8,7 19 1121662 .../1.tmp
f2(): Before moving pipe somewhere
f2(): BASHPID[4803]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4803 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4803 root 1w REG 8,7 19 1121662 .../1.tmp
t.sh 4803 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4803 root 7w FIFO 0,6 14194 pipe
f2(): Before exit f2()
f(): BASHPID[4796]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4796 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 1w REG 8,7 61 1121662 .../1.tmp
t.sh 4796 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 7w FIFO 0,6 14132 pipe
f(): Before restoring pipe
f(): BASHPID[4796]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4796 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4796 root 1w FIFO 0,6 14132 pipe
t.sh 4796 root 2u CHR 4,1 5466 /dev/vc/1
f(): Before exit f()
$$[4794]:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
t.sh 4794 root 0u CHR 4,1 5466 /dev/vc/1
t.sh 4794 root 1w REG 8,7 71 1121662 .../1.tmp
t.sh 4794 root 2u CHR 4,1 5466 /dev/vc/1
t.sh 4794 root 255r REG 8,7 2075 1121653 .../t.sh
Thursday, May 20, 2010
[summary][draft][part] Tabs in vim
Status: summary.
State: draft,part.
Detailed description: Some tables illustrating <Tab> handling in vim. See vim help for details.
Tabs in vim {{{
'sts' and 'sta' options {{{
The following options affect <Tab>s and indents:
'tabstop' 'ts'
'shiftwidth' 'sw'
'softtabstop' 'sts'
'smarttab' 'sta'
'expandtab' 'et'
'sw' and 'ts' define the behavior of standard vim operations; they are not
switches for some feature (hence they're always set and always used).
But 'sts' and 'sta' enable additional features, which change
the behavior of some vim operations (hence, they can be unset to turn the
feature off).
The table below shows which option is used to determine how many
positions to insert or delete during some editing operations, depending on
the active modes: both 'sts' and 'sta' off (default), 'sts' set,
'sta' set, and both 'sts' and 'sta' set. When an editing operation inserts
fewer (or more) positions than a real <Tab> counts for, a mix of spaces
and real <Tab>s is used.
+--------------------------+-------------------------------------------+
| Operation | Will be inserted .. positions |
| +------+-----------+-----------+------------+
| | | +sts | +sta | +sts +sta |
+--------------------------+------+-----------+-----------+------------+
| Use >> , etc | sw | sw | sw | sw |
+--------------------------+------+-----------+-----------+------------+
| Type <Tab> or <BS> | ts | sts (mix) | | |
| + + +-----------+------------+
| at the start of line | | | sw (mix) | sw (mix) |
| + + +-----------+------------+
| in other places | | | ts | sts (mix) |
+--------------------------+------+-----------+-----------+------------+
| Real <Tab> length | ts | ts | ts | ts |
+--------------------------+------+-----------+-----------+------------+
}}}
':retab' and changing 'ts' option value {{{
Changing the 'ts' option changes the real tabstop, but does not change the
text. Hence, indents made according to the old tabstop will probably be messed
up (i.e. there will be visible changes in the text), but the actual file remains
untouched. So using 'undo' after changing 'ts' makes no sense.
The :retab command changes both the 'ts' option and the text according to the
new 'ts' value in such a way that all indents remain the same (there are no
visible changes), though the actual file is changed (to preserve
visible indents :retab pads them, if necessary, with spaces). So
using 'undo' after :retab restores the text to its previous state, but does not
restore 'ts' to its previous value, hence indents may be messed up (like
after you change only 'ts'), and to recover the visible state of the text you
need to set 'ts' back to its previous value. The table below summarizes this.
+-------------------+---------------+-----------+
| | Change 'ts' | :retab |
+-------------------+---------------+-----------+
| 'ts' changed? | yes | yes |
+-------------------+---------------+-----------+
| File changed? | no | yes |
+-------------------+---------------+-----------+
| Visible changes? | yes | no |
+-------------------+---------------+-----------+
}}}
}}}
Tuesday, May 4, 2010
[method][draft][part][need_check] (bash) passing arguments to function
Status: method.
State: suggestion,draft,part,need_check.
Detailed description: One more method for passing arguments to bash function.
Outline: below, by 'passing an argument as a pointer' I mean passing an argument by name:
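function f() {
    # "$1" holds the caller's variable *name*, not its value.
    local name=$1
    printf '%s\n' "${!name}"    # indirect expansion reads the value
}
declare var='abc'
f var    # not f "$var"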
This may be considered an analogy to pointers (in C), I think :-)
1. Passing an argument to a function as a pointer may sometimes require
redefining it as a local variable in order to preserve the referred data.
Although a pointer is generally used to let a function modify the caller's
environment, in bash it may also be used to skip unnecessary copying of a
big argument when we want to redefine all positional parameters as local
variables (the names 1, 2, 3,.. are bad anyway, and positional parameters
may also be used for some tricks, so it's better when they do not contain
useful data).
eval "
$(declare -p "$2" | sed -e 's/^[^=]*=/local -a shortopts=/');
$(declare -p "$3" | sed -e 's/^[^=]*=/local -a longopts=/');
"
The construction above is, seemingly, the only way. To correctly redefine
a pointer as a local variable we should honour several constraints:
1. we should not define any local variable before all pointers have been
expanded, because otherwise there is a chance that the name of a defined
local variable matches the name of some pointed-to variable, and we
lose access to it.
2. we should preserve all special characters inside the pointed-to data.
To handle the first constraint, we need `eval`.
To handle the second, we should properly quote the expanded value of the
pointed-to variable:
${[@]} - splits elements correctly only during the first expansion, but in
the second (done in `eval`) these boundaries will be lost and bash splits
elements by 'metacharacters', not by IFS or anything else (info bash,
ch-2).
${[*]} - may be used, perhaps, with IFS=',' and then.. something
artificial to set up quoting by brace expansion. But this rarely works,
though (I don't know a working example).
`declare` - seems to be the only way to obtain correct quoting without many
evals and artificial tricks. But its output should, perhaps, be
processed by some program to replace the variable name with the desired one.
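A minimal sketch of the `declare` variant, to show both constraints at work. The function name `show_items` and the array contents are hypothetical, chosen only for illustration. The caller's array is deliberately named the same as the local one, and its elements contain a blank, a ';' and a '*'.

```shell
#!/bin/bash
# Hypothetical function: $1 is the NAME of the caller's array.
show_items() {
    # The command substitution runs before 'local' does, so the pointer is
    # expanded while the caller's variable is still visible, even if that
    # variable is itself called 'items'.
    eval "$(declare -p "$1" | sed -e 's/^[^=]*=/local -a items=/')"
    printf '%s\n' "${#items[@]}" "${items[@]}"
}

declare -a items=('a b' 'c;d' '*')
show_items items
```

The function prints the element count (3) and then each element on its own line, unchanged. The sed expression only rewrites the text up to the first '=', so this sketch assumes element values without embedded newlines.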
[summary][draft][part] Bash script parsing
Status: summary.
State: draft,part.
Detailed description: illustration for info bash ch-2.
Bash script parsing units:
              token = single unit
               /             \
              /               \
           word             operator
          / |  \             /     \
         /  |   \           /       \
 reserved  name  ..    control   redirection
   word                operator   operator
metacharacter - separates words (or tokens?)
     /      \
    /        \
 blank     some (why not all?)
           control operators
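A small illustration of the metacharacter line above (a sketch; the function name `demo` is hypothetical). Unquoted ';' and '|' end a word even when no blank surrounds them, while quoting removes that meaning.

```shell
#!/bin/bash
demo() {
    # ';' and '|' are metacharacters: three commands, no blanks needed.
    echo one;echo two|tr a-z A-Z
    # Quoted, the same characters are ordinary: this is a single word.
    echo 'one;two|three'
}
demo
```

The first line of the body prints "one" and "TWO"; the second prints the string "one;two|three" as is.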
Friday, 23 April 2010
[additions][done][unmaintained] Additions to the manual for Joker/Kenwood TK/JK-450S radios.
Status: additions.
State: done,unmaintained.
Detailed description: ...
TODO: [dropped] about dialing the channel number.
TODO: [dropped] what does the F+0 function enable?
CTCSS/DCS subtone settings table.
+-----------------------+--------------------------------------+
| Action                | Setting                              |
|                       +------------------+-------------------+
|                       | CTCSS            | DCS               |
+-----------------------+------------------+-------------------+
| Turn CTCSS/DCS on/off | (20) CT.DCS                          |
+-----------------------+--------+---------+---------+---------+
| Set receive tone      | (7) RC | (21) CT | (9) RD  | (22) DC |
+-----------------------+--------+         +---------+         |
| Set transmit tone     | (8) TC |         | (10) TD |         |
+-----------------------+--------+---------+---------+---------+
Notes.
1. Settings (21) CT and (22) DC set both the receive tone and the
transmit tone to the same value. Therefore, if the receive and
transmit tones are to be the same, these are more convenient
to use than the two separate settings (7)/(8) or
(9)/(10).
Setting up different subtones for receive/transmit and a receive/transmit frequency shift.
The shift (of both frequencies and subtones) works only for a conversation
between two radios: a third radio (R-3) can talk either only with the
first (R-1) or only with the second (R-2). To talk with the first, set up
the third like the second (see the diagram below). In this case, the third
will not hear the second, and the second will not hear the third. To
talk with the second, set up the third like the first.
           +-----------------------+
           v                       |
           Rx        +--------------------> Rx
      [Ch=B, DCS=Bd] |             | [Ch=A, DCS=Ad]
+-----+              |             |              +-----+
| R-1 | Tx ----------+             +---------- Tx | R-2 |
|     | [Ch=A, DCS=Ad]             [Ch=B, DCS=Bd] |     |
+-----+                                           +-----+
Legend:
Ch - channel (frequency);
DCS - DCS tone;
Setup sequence:
A=433.265, Ad=025I, B=433.255, Bd=023N
1. Tune the radios to the receive frequency.
R-1: set the frequency to 433.255;
R-2: set the frequency to 433.265;
2. Set the frequency shift.
R-1: set (6) OFFSET to 0.010;
R-2: set (6) OFFSET to 0.010;
3. Turn on the frequency shift.
R-1: set (5) SFT to '+';
R-2: set (5) SFT to '-';
The frequency shift is on. Check that communication between the radios
works.
4. Set the DCS transmit tone.
R-1: set (10) Td to '025I';
R-2: set (10) Td to '023N';
5. Set the DCS receive tone.
R-1: set (9) Rd to '023N';
R-2: set (9) Rd to '025I';
6. Turn on the use of DCS tones.
R-1: set (20) CT.DCS to 'DCS';
R-2: set (20) CT.DCS to 'DCS';
The DCS tones are on. Check that communication between the radios
works.
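The frequency part of the setup can be checked with a little arithmetic. This is a sketch under an assumption: that SFT '+' means "transmit = receive + OFFSET" and '-' means "transmit = receive - OFFSET", which is what the sequence above implies. The function name `tx_khz` is hypothetical.

```shell
#!/bin/bash
# tx_khz RX_KHZ OFFSET_KHZ SIGN: transmit frequency implied by (6) OFFSET
# and (5) SFT. Frequencies are integer kHz, since bash has no floating point.
tx_khz() {
    local -i rx=$1 offset=$2
    case $3 in
        +) echo $(( rx + offset )) ;;
        -) echo $(( rx - offset )) ;;
        *) echo "$rx" ;;              # shift switched off
    esac
}
tx_khz 433255 10 +    # R-1 transmits on A, the receive frequency of R-2
tx_khz 433265 10 -    # R-2 transmits on B, the receive frequency of R-1
```

Each radio thus transmits on the frequency the other one listens on, which is why a third radio can be paired with only one of them at a time.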
Scan modes:
(1) SCAN= TO - stop on a found signal and resume
after 5 seconds;
CO - stop on a found signal and resume 5 seconds
after the signal disappears;
SE - stop on the first found signal;
Other settings:
(2) TOT - Tx Timeout Timer
the time (in seconds) after which transmission is cut off even
while PTT is pressed (in case of a stuck PTT).
(11) APO - Automatic Power-off
automatic power-off of the radio after the set time (in
minutes).
This note uses material from the forums
tucson-club.ru (tucson-club.ru/forum/...)
and
lpd.radioscanner.ru (lpd.radioscanner.ru/topic...).
Thursday, 4 February 2010
LVM notes [draft]
Status: Many chapters are missing and not posted here yet. Formatting may contain errors and missing entries.
UPD-01-03-10_14-32:
+ Title changed.
+ Metadata circular buffer description (draft) added.
+ LVM extent and stripe difference (copy of a letter, originally in Russian)
UPD:[2010.02.10]: minor formatting improvements.
FIXME: line length >= 80.
FIXME: indents.
FIXME: <> as notation, not as HTML tags.
FIXME: spacing between words (see `od` output).
FIXME: and the font is silly: zeros are for some reason smaller than the other characters :D
The text below may look much better in vim with folding enabled ('fdm=marker', 'fmr={{{,}}}'), though the indents probably remain incorrect. Perhaps someday I'll fix this :-)
Draft of metadata sectors layout:
1. LVM label sector (0-r1) {{{
Location: {{{
- By default, `pvcreate` places the physical volume label
in sector 1 (the 2nd 512-byte block).
- This label can optionally be placed in any of the first
four sectors ('--labelsector' option). And because LVM
tools scan these first four sectors for a PV label, zeroing
them ('-Zy' option) is recommended.
}}}
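The scan of the first four sectors can be sketched as below. This is only an illustration, not how the LVM tools are implemented: the function name `find_label` is hypothetical, and it only compares the first 8 bytes of each sector with the "LABELONE" string seen in the dump of this section.

```shell
#!/bin/bash
# find_label FILE: print the number (0..3) of the first sector that starts
# with "LABELONE"; return 1 if none of the first four sectors does.
find_label() {
    local -i sector
    local magic
    for sector in 0 1 2 3; do
        # Read 8 bytes at the sector start; drop NUL bytes, which a shell
        # variable cannot hold anyway.
        magic=$(dd if="$1" bs=1 skip=$(( sector * 512 )) count=8 2>/dev/null |
                tr -d '\000')
        if [ "$magic" = LABELONE ]; then
            echo "$sector"
            return 0
        fi
    done
    return 1
}
```

It can be pointed at a device or at an image file, e.g. `find_label /dev/sdb4` (reading a device usually needs root); with the default `pvcreate` placement it should print 1.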
Format: {{{
Let's consider an example:
> `pvcreate -vvv -Zy -M2 --metadatacopies=[012] --uuid=.. /dev/sdb4` {{{
>
> /dev/sdb4: size is 58589055 sectors
> with mcopies=0: 58588927 available sectors
> with mcopies=1: 58588671 available sectors
> metadata area at sector 8 size 376 sectors
> with mcopies=2: 58588416 available sectors
> metadata area at sector 8 size 376 sectors
> metadata area at sector 58588800 size 255 sectors
>
> Area sizes (available sectors) in hex:
> data area size =
> (with mcopies=0) = 58588927 sectors = 0x37dfeff sectors = 0x06fbfdfe00 bytes
> (with mcopies=1) = 58588671 sectors = 0x37dfdff sectors = 0x06fbfbfe00 bytes
> (with mcopies=2) = 58588416 sectors = 0x37dfd00 sectors = 0x06fbfa0000 bytes
> meta area size =
> (1st meta area) = 376 sectors = 192512 bytes = 0x02f000 bytes
> (2nd meta area) = 255 sectors = 130560 bytes = 0x01fe00 bytes
>
> Area offsets in hex:
> data area offset =
> (with mcopies=0) = 128 sectors = 65536 bytes = 0x010000 bytes
> (with mcopies=1,2) = 384 sectors = 196608 bytes = 0x030000 bytes
> meta area offset =
> (1st meta area) = 8 sectors = 4096 bytes = 0x1000 bytes
> (2nd meta area) = 58588800 sectors = 0x37dfe80 sectors = 0x06fbfd0000 bytes
>
> PV UUID = 'pesv0I-D0Ok-cVts-73Pg-vIaN-IRz2-LSldOn'
>
> Sector 1 dump: {{{
>
> Below for each 16 bytes row 1st line is for mcopies=0, 2nd - mcopies=1, 3rd -
> mcopies=2.
>
> 000200 4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00
> L A B E L O N E soh nul nul nul nul nul nul nul
> 000200 4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00
> L A B E L O N E soh nul nul nul nul nul nul nul
> 000200 4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00
> L A B E L O N E soh nul nul nul nul nul nul nul
> --
> 000210 e3 bb 4a cb 20 00 00 00 4c 56 4d 32 20 30 30 31
> c ; J K sp nul nul nul L V M 2 sp 0 0 1
> 000210 3c a9 89 2c 20 00 00 00 4c 56 4d 32 20 30 30 31
> < ) ht , sp nul nul nul L V M 2 sp 0 0 1
> 000210 76 4b 22 d4 20 00 00 00 4c 56 4d 32 20 30 30 31
> v K " T sp nul nul nul L V M 2 sp 0 0 1
> --
> 000220 70 65 73 76 30 49 44 30 4f 6b 63 56 74 73 37 33
> p e s v 0 I D 0 O k c V t s 7 3
> 000220 70 65 73 76 30 49 44 30 4f 6b 63 56 74 73 37 33
> p e s v 0 I D 0 O k c V t s 7 3
> 000220 70 65 73 76 30 49 44 30 4f 6b 63 56 74 73 37 33
> p e s v 0 I D 0 O k c V t s 7 3
> --
> 000230 50 67 76 49 61 4e 49 52 7a 32 4c 53 6c 64 4f 6e
> P g v I a N I R z 2 L S l d O n
> 000230 50 67 76 49 61 4e 49 52 7a 32 4c 53 6c 64 4f 6e
> P g v I a N I R z 2 L S l d O n
> 000230 50 67 76 49 61 4e 49 52 7a 32 4c 53 6c 64 4f 6e
> P g v I a N I R z 2 L S l d O n
> --
> 000240 00 fe fd fb 06 00 00 00 00 00 01 00 00 00 00 00
> nul ~ } { ack nul nul nul nul nul soh nul nul nul nul nul
> 000240 00 fe fb fb 06 00 00 00 00 00 03 00 00 00 00 00
> nul ~ { { ack nul nul nul nul nul etx nul nul nul nul nul
> 000240 00 00 fa fb 06 00 00 00 00 00 03 00 00 00 00 00
> nul nul z { ack nul nul nul nul nul etx nul nul nul nul nul
> --
> 000250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> --
> 000260 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000260 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul dle nul nul nul nul nul nul
> +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +a +b +c +d +e +f
> 000260 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul dle nul nul nul nul nul nul
> --
> 000270 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000270 00 f0 02 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul p stx nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000270 00 f0 02 00 00 00 00 00 00 00 fd fb 06 00 00 00
> nul p stx nul nul nul nul nul nul nul } { ack nul nul nul
> --
> 000280 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000280 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +a +b +c +d +e +f
> 000280 00 fe 01 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul ~ soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> --
> 000290 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000290 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 000290 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> --
> (last NULL line repeated up to sector end)
>
> }}}
>
> }}}
Notes: {{{
- Sizes and offsets below are in bytes.
- All ranges below include their boundaries.
- Numbers (like size and offset) are written on disk with the least
significant byte first. I.e. the least significant byte of a
number on disk has a smaller offset from the disk beginning
than the most significant one.
- All area locations are rounded to a sector boundary (512-byte
block), hence all size and offset values are
divisible by 512 (0x200), hence in hex they always
have a NULL least significant byte (0x00).
- I don't know for sure whether this NULL least significant
byte is actually written on disk or not, but more likely it
is. The most confusing place with this byte is the beginning
of the data area size after the UUID: the NULL byte at 0x240+0 may be
a NULL-termination byte after the UUID as well as the least
significant byte of the data area size value. I assume that
it is the NULL least significant byte (see below).
}}}
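The byte order described in the notes can be checked against the dump with a few lines of bash. A minimal sketch: the function name `le64` is hypothetical, and the bytes are copied from row 0x240 of the mcopies=0 dump.

```shell
#!/bin/bash
# le64 B0 B1 ... B7: combine hex bytes, given in on-disk order (least
# significant first), into one number and print it in decimal.
le64() {
    local -i value=0 bits=0
    local byte
    for byte in "$@"; do
        (( value |= 16#$byte << bits, bits += 8 ))
    done
    echo "$value"
}
le64 00 fe fd fb 06 00 00 00    # +0 -> +7 of row 0x240, mcopies=0
le64 00 00 01 00 00 00 00 00    # +8 -> +f of row 0x240, mcopies=0
```

The first call gives 29997530624 bytes, i.e. 58588927 sectors of 512 bytes, and the second gives 65536 bytes, i.e. 128 sectors; both agree with the area size and offset listed for mcopies=0. Bash arithmetic is signed 64-bit, so a value with the top bit set would print as negative.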
Assumption: {{{
- The NULL least significant byte is written on disk as is.
- A size value always occupies the +0 -> +7 range of a row.
- An offset value always occupies the +8 -> +f range of a row.
- So, size and offset can each be up to a 16-digit hex number, and
together they occupy exactly one row.
}}}
Some sort of proof for the assumptions: {{{
- Throwing out part of a number (the NULL least significant byte)
and not writing it to disk would not save space allocated
for metadata, but may cause problems in the future, when this
least significant byte may become non-NULL. So I don't
see any sane reason for this.
- Not writing a NULL-termination byte after the UUID looks
possible, because the UUID size is fixed (I think).
- If the offset and size regions on disk occupied different
numbers of bytes, we either could not provide a method to
address all of the allocated size (through the offset) or would
have offset values which would never be used - both only
introduce future incompatibilities. Hence, the maximum allowed
offset and size values should be the same and they
should occupy the same number of bytes on disk.
- Splitting one 16-byte row into two equal 8-byte parts
for size (first) and offset (second) seems to be very
reasonable: the size and offset disk regions occupy
the same number of bytes and also, as the example shows,
entire values (including the NULL least significant byte)
are written into these regions.
- Eventually, I did not find any conflict with any of my
assumptions :-)
}}}
Short Physical Volume label sector format: {{{
0x200 <LVM_magic_string>
0x210 <smth_unknown_and_LVM_version>
0x220 <PV_UUID>
0x230 <PV_UUID(continue)>
0x240 <data_size(8b)> <data_offset(8b)>
0x250 <NULL>
0x260 <NULL(8b)> <1st_meta_offset(8b)>
0x270 <1st_meta_size(8b)> <2nd_meta_offset(8b)>
0x280 <2nd_meta_size(8b)> <NULL(8b)>
0x290 <NULL_(up_to_0x400)>
}}}
Detailed Physical Volume label sector format: {{{
0x200+0 -> 0x200+f: LVM magic string. Identical for all
three PVs.
0x210+0 -> 0x210+7: Unknown.
0x210+8 -> 0x210+b: LVM version (i suppose).
0x210+c -> 0x210+c: Simply a separator (i suppose).
0x210+d -> 0x210+f: LVM PV label sector (this sector)
format version (i suppose).
0x220+0 -> 0x230+f: PV UUID.
0x240+0 -> 0x240+7: Data area size (number of available
bytes). Always present.
0x240+8 -> 0x240+f: Data area offset. Always present.
0x250+0 -> 0x250+f: NULL. Why?
0x260+0 -> 0x260+7: NULL. Why?
0x260+8 -> 0x260+f: 1st metadata circular buffer offset.
Only for mcopies=1,2.
0x270+0 -> 0x270+7: 1st metadata circular buffer size.
Only for mcopies=1,2.
0x270+8 -> 0x270+f: 2nd metadata circular buffer offset.
Only for mcopies=2.
0x280+0 -> 0x280+7: 2nd metadata circular buffer size.
Only for mcopies=2.
0x280+8 -> 0x3f0+f: NULL; seems to be unused.
}}}
Last notes: {{{
- The metadata area size and location (the sector number where
the metadata circular buffer begins) can be obtained from
`pvcreate -vvv` output. If no metadata areas (circular
buffers) were selected during PV creation (`pvcreate
--metadatacopies=0`), then the metadata buffer area size will
be set to 120 sectors (it seems that this is the lowest
size), though it will not be used (there will be no pointers to
the metadata buffer in the label sector, and sector 8 remains
unchanged as well). This can be determined by
subtracting the count of available sectors from the total count
of sectors on disk (you get 128 = 120 + 8).
}}}
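The subtraction from the last note, repeated for all three cases of the example. A small sketch: the function name `reserved_sectors` is hypothetical, and the sector counts are copied from the `pvcreate -vvv` example of this section.

```shell
#!/bin/bash
reserved_sectors() {
    local -i total=58589055 mcopies
    local -a available=(58588927 58588671 58588416)   # index = mcopies
    for mcopies in 0 1 2; do
        echo "mcopies=$mcopies reserved=$(( total - available[mcopies] ))"
    done
}
reserved_sectors
```

This prints 128, 384 and 639 reserved sectors. The last two agree with the reported areas: 384 = 8 + 376 (the first metadata area at sector 8) and 639 = 384 + 255 (plus the second metadata area).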
}}}
}}}
2. LVM circular buffer (0-r1) {{{
Short sector 8 format: {{{
0x1000 <Unknown>
0x1010 <Unknown(8b)> <1st_meta_offset(8b)>
0x1020 <1st_meta_size(8b)> <latest_entry_offset(8b)>
0x1030 <latest_entry_size(8b)> <Unknown(8b)>
0x1040 <NULLs_(up_to_0x1200)>
}}}
Detailed sector 8 format: {{{
0x1000+0 -> 0x1000+f: Unknown (2).
0x1010+0 -> 0x1010+7: Unknown (2).
0x1010+8 -> 0x1010+f: 1st metadata circular buffer (this
buffer) offset.
0x1020+0 -> 0x1020+7: 1st metadata circular buffer (this
buffer) size.
0x1020+8 -> 0x1020+f: Latest metadata entry in 1st circular
buffer (this buffer) offset. Offset
from beginning of the buffer, but
NOT from beginning of the PV.
0x1030+0 -> 0x1030+7: Latest metadata entry in 1st
circular buffer (this buffer) size,
including null-terminator.
0x1030+8 -> 0x1030+f: Unknown (1).
0x1040+0 -> 0x11f0+f: NULLs; seems to be unused.
(1): This number is probably not:
- the first unallocated PE (checked by value on an example).
- the PE size (checked by value on an example).
- the PV size (checked by value on an example).
- the 2nd metadata buffer offset (it is present even in PVs
with a single metadata buffer).
- the latest metadata entry timestamp or smth else related
to the latest metadata entry (it differs for different
PVs in the same VG, but the latest metadata entry is
the same for all PVs of the same VG).
Also:
- this value is not divisible without remainder by
1024 or 512.
(2): Notes:
- the range 0x1000+4 -> 0x1010+7 seems to be the same for
all PVs (even from different VGs).
}}}
Metadata entries location: {{{
- An lvm metadata entry's location on disk is aligned roughly to a
sector boundary (512-byte block). I.e. words such as
vg_mp3 {
id = "YWvzHx-M1X5-TWtl-vCD1-w2zn-y0da-qui6PK"
seqno = 50
status = ["RESIZEABLE", "READ", "WRITE"]
will be placed only at a sector's boundary (beginning).
- An lvm metadata entry can occupy several sectors, though if the
last occupied sector is not fully filled, the trailing
part of the sector will not be cleared and, hence, it can
contain some garbage (namely, some part of the data from a
previous record occupying this sector).
}}}
Metadata entries format: {{{
- Each metadata entry on disk ends with a null terminator.
- On disk, the metadata timestamp (information about how and
when the metadata entry was created) is written after the VG
description (information about the volume group structure,
see below) to which it relates (in contrast with the metadata
backup file produced by `vgcfgbackup`, where timestamps
are written first).
- On disk, the 'description' field in the metadata timestamp is
empty (but in the metadata backup file produced by
`vgcfgbackup` it is not). (why?)
}}}
Last notes: {{{
- When a PV contains no metadata circular buffer areas
('--metadatacopies=0' given to `pvcreate`), then restoring the VG
does not change anything in the PV metadata.
- To obtain the offset from the beginning of the PV to the latest
metadata entry in the circular buffer, sum the '0x1010+8 -> 0x1010+f'
value with the '0x1020+8 -> 0x1020+f' value.
- In order to locate the latest metadata entry in a raw 'on disk'
metadata copy, you should look through a dump of the metadata
circular buffer (it mostly starts from sector 9, but if not, the
exact offsets can be obtained from the PV label sector),
split by sectors (512-byte blocks), for sectors
beginning like
vg_mp3 {
id = "YWvzHx-M1X5-TWtl-vCD1-w2zn-y0da-qui6PK"
seqno = 50
status = ["RESIZEABLE", "READ", "WRITE"]
This is the beginning of a correct metadata entry.
Afterwards, you should choose the one with the latest 'seqno'
field. As explained above, simply searching for the word 'seqno'
may match some garbage data after the end of a correct
metadata entry. Though, because we look for the entry with
the latest 'seqno', we'll select the correct one anyway. Also
note that the word 'seqno' may appear as garbage only in the first
few bytes of a sector.
}}}
}}}
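The sum from the last notes of section 2 can be sketched like this. It relies on the assumptions made in these notes (8-byte little-endian fields at 0x1010+8 and 0x1020+8, first buffer header in sector 8); the function names `read_le64` and `latest_entry_offset` are hypothetical.

```shell
#!/bin/bash
# read_le64 FILE OFFSET: read 8 bytes at byte OFFSET, least significant first.
read_le64() {
    local -i value=0 bits=0
    local byte
    for byte in $(od -A n -t x1 -j "$2" -N 8 "$1"); do
        (( value |= 16#$byte << bits, bits += 8 ))
    done
    echo "$value"
}
# latest_entry_offset FILE: offset of the latest metadata entry from the
# beginning of the PV (buffer offset + entry offset inside the buffer).
latest_entry_offset() {
    local -i buffer entry
    buffer=$(read_le64 "$1" $(( 0x1010 + 8 )))
    entry=$(read_le64 "$1" $(( 0x1020 + 8 )))
    echo $(( buffer + entry ))
}
```

`latest_entry_offset` takes a PV device or an image of its first sectors. The printed byte offset can then be passed to `dd` as `skip=` (with `bs=1`) to read the entry itself, whose size is stored at 0x1030+0 -> 0x1030+7.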
LVM extent and stripe: (copy of a letter :-)
<...>
And by stripes I meant an LVM logical volume with stripe mapping, i.e.,
for example, a command like this
`lvcreate -vvv -i3 -I32 -l32000 -n striped_lv_3x32k test_vg`
And it was unclear to me why LVM uses two kinds of blocks - extent and stripe -
and why all kinds of mapping - both linear and stripe mapping - could not
be implemented using only one kind of block. But I then looked again at the
`lvcreate -vvv` logs, and in the part related to activation of the logical
volume I think I found the answer. Though I'm not sure it is completely
right :-) Here, for example:
--- Volume group ---
VG Name vg_4k
PE Size 4.00 KB
VG UUID M7TWGx-Rggp-Okru-nWGY-c7Mh-MaLi-LhYxFm
now, if we create a logical volume (32000 extents in size) with stripe
mapping, spread over 3 physical volumes and with a stripe size of 4 KB,
we get:
`lvcreate -vvv -i3 -I4 -l32000 -n s_lv_3x16k vg_4k`
<..>
Creating vg_4k-s_lv_3x16k
dm create vg_4k-s_lv_3x16k
LVM-M7TWGxRggpOkrunWGYc7MhMaLiLhYxFm0YvE0mFyjZS7Q103PbM4efKUb2lUXnR8
NF [16384]
Loading vg_4k-s_lv_3x16k table
Adding target: 0 256008 striped 3 8 7:1 384 7:2 384 7:0 384
dm table (253:0) OF [16384]
dm reload (253:0) NF [16384]
Resuming vg_4k-s_lv_3x16k (253:0)
dm resume (253:0) NF [16384]
<..>
Though this was, of course, known even without the log, but still. I.e. all
LVM functionality is implemented through device-mapper, but device-mapper
knows nothing about extents and does not use them. In its tables (`dmsetup
table`) it uses disk blocks (512 bytes) for all sizes and offsets. And
for striped tables it also uses the stripe - the block of data that will be
written to one physical device (i.e. the standard definition of a stripe).
I.e. it turns out that device-mapper uses just one type of block -
only the stripe. And it turns out that the extent is a block used only for
the convenience of managing LVM volumes, and it is not used while LVM works
(I/O). I.e. it is a block used only by the LVM toolset. Then this
remark from the description of the '-s' option in `man vgcreate` becomes clear:
-s, --physicalextentsize PhysicalExtentSize[kKmMgGtT]
<..>
If the volume group metadata uses lvm2 format those restrictions
do not apply, but having a large number of extents will slow
down the tools but have no impact on I/O performance to the
logical volume. The smallest PE is 1KB.
And the restrictions on stripe and extent sizes are apparently made so that
they all fit into each other: 512 bytes - the disk block - is a power of two,
so, probably, stripe and extent must also be powers of two in order to
contain a whole number of disk blocks. Besides, since both stripe and extent
are powers of two, an extent will always contain a whole number of stripes.
Though it is not quite clear why device-mapper does not allow setting
a stripe size smaller than 4 KB.
<...>
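The striped target line from the log can be decoded with a few lines of bash. A sketch only: the function name `describe_striped` is hypothetical, and the field order used is that of the device-mapper striped target (start, length, target type, number of stripes, chunk size, then device/offset pairs), with all numbers in 512-byte sectors.

```shell
#!/bin/bash
# describe_striped 'TABLE LINE': print a few sizes derived from the line.
describe_striped() {
    local start length target stripes chunk rest
    read -r start length target stripes chunk rest <<< "$1"
    echo "stripe size: $(( chunk * 512 )) bytes"
    echo "sectors per device: $(( length / stripes ))"
    echo "4 KB extents covered: $(( length * 512 / 4096 ))"
}
describe_striped '0 256008 striped 3 8 7:1 384 7:2 384 7:0 384'
```

The table carries only the stripe size (8 sectors = 4096 bytes) and lengths in sectors; no extent size appears in it. The length corresponds to 32001 extents of 4 KB rather than the requested 32000, presumably because the size was rounded up to a whole number of stripes across the 3 devices.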
