DISCLAIMER. English is used here only for compatibility (ASCII only), so any suggestions about my bad grammar (and not only that) will be greatly appreciated.

Wednesday, December 14, 2011

initramfs and noexec /tmp

If /tmp is mounted with 'noexec' while update-initramfs (called by, e.g., the
kernel postinst script) creates a new initramfs, the resulting initramfs will
be slightly different and may report errors at boot, like

E: /scripts/local-premount/resume failed with return 255.


Here the terms 'ramdisk', 'initramfs', 'ramfs', etc. are used interchangeably.
If there is any difference, I don't care.
This article applies to Debian 6.


What causes this error?

This error occurs between the 'mountroot' and 'bottom' break points in the
initramfs's 'init' (break points can be set with the 'break=NAME' kernel
command line parameter, which is parsed by the initramfs's 'init'). The
function mountroot() is called between these break points:

    initrd/init:
        ..
        maybe_break mountroot
        mountroot
        log_end_msg

        maybe_break bottom
        ..
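
To make the 'break=NAME' mechanism a bit more concrete, here is a simplified
sketch of how a value like that can be picked out of a kernel command line.
This is not the actual code of 'init'; parse_break and the sample command line
are made up for illustration.

```sh
#!/bin/sh
# parse_break CMDLINE: hypothetical helper, prints the NAME given as
# break=NAME on a kernel command line (nothing if it is absent).
parse_break() {
    for arg in $1; do
        case "$arg" in
            break=*) echo "${arg#break=}" ;;
        esac
    done
}

# Made-up command line; on a real system it would come from /proc/cmdline.
cmdline='root=/dev/sda1 ro quiet break=mountroot'
break_point=$(parse_break "$cmdline")
echo "init would drop to a shell at break point: $break_point"
```

With 'break=mountroot' you get a shell right before mountroot() runs, which is
a convenient place to start the 'resume' script by hand.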

mountroot() is defined in 'scripts/local' and calls the scripts from the
'scripts/local-premount' directory:

    initrd/scripts/local:
        ..
        mountroot()
        {
                ...
                [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/local-premount"
                run_scripts /scripts/local-premount
                [ "$quiet" != "y" ] && log_end_msg
                ..
        }

The 'resume' script from the 'local-premount' directory tries to resume the
system from the suspend-to-disk state. For this it uses the 'bin/resume'
binary from the klibc-utils package (/usr/lib/klibc/bin/resume in the
system). And.. this program indeed exits with code -1 if it can't resume the
system from the s2disk state (it assumes that a swap partition is used for
saving the state, right?). An exit code of -1 is truncated to 8 bits, so the
shell sees it as 255, which matches the error message:

    klibc-1.5.20/usr/kinit/resume/resumelib.c:
        int resume()
        {
        ..
                dprintf("kinit: trying to resume from %s\n", resume_file);

                if (write(powerfd, device_string, len) != len)
                        goto fail_r;

                /* Okay, what are we still doing alive... */
        failure:
                if (powerfd >= 0)
                        close(powerfd);
                dprintf("kinit: No resume image, doing normal boot...\n");
                return -1;
        ..
        }

(powerfd is the file descriptor of the opened /sys/power/resume file).
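
The 'return -1' above and the 'return 255' in the error message are the same
thing: only the low 8 bits of an exit code reach the parent process. A small
sketch to check the arithmetic (it does not run the real 'resume' binary, a
child shell exiting with 255 stands in for it):

```sh
#!/bin/sh
# -1 truncated to its low 8 bits, which is what the parent process sees.
status=$(( -1 & 255 ))
echo "-1 as an exit status: $status"

# Cross-check with a real child process; the '&& ... || ...' form is the
# same idiom call_scripts() uses to capture the code.
sh -c 'exit 255' && child_status=$? || child_status=$?
echo "child exit status: $child_status"
```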


Who prints the error?

Scripts from the /scripts directory in the initramfs may be run by two
different methods: by sourcing an ORDER file, or by computing the order at
boot and calling call_scripts(). The run_scripts() function from the
'scripts/functions' file (/usr/share/initramfs-tools/scripts/functions in the
system) chooses between them:

    initrd/scripts/functions:
        run_scripts()
        {
                initdir=${1}
                [ ! -d ${initdir} ] && return

                if [ -f ${initdir}/ORDER ]; then
                        . ${initdir}/ORDER
                elif command -v tsort >/dev/null 2>&1; then
                        runlist=$(get_prereq_pairs | tsort)
                        call_scripts ${2:-}
                else
                        get_prereqs
                        reduce_prereqs
                        call_scripts
                fi
        }

The second method calls the call_scripts() function, which checks the return
code and prints the error message above:

    initrd/scripts/functions:
        call_scripts()
        {
        ..
            ${initdir}/${cs_x} && ec=$? || ec=$?
            # allow hooks to abort build:
            if [ "$ec" -ne 0 ]; then
                    echo "E: ${initdir}/${cs_x} failed with return $ec."
                    # only errexit on mkinitramfs
                    [ -n "${version}" ] && exit $ec
            fi
        ..
        }

The first method, though, does exactly what is written in the ORDER file, and
the return code is not checked:

    scripts/local-premount/ORDER:
        /scripts/local-premount/resume
        [ -e /conf/param.conf ] && . /conf/param.conf
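
The difference between the two methods can be reproduced without an initramfs
at all. Below is a minimal sketch: fake_resume is a made-up stand-in for the
'resume' script that always fails with 255, and the two run_via_* functions
only imitate the ORDER file and call_scripts(), they are not the real code.

```sh
#!/bin/sh
# Made-up stand-in for /scripts/local-premount/resume without a resume image.
fake_resume() { return 255; }

# Method 1, ORDER style: the commands are simply run, nobody looks at $?.
run_via_order() {
    fake_resume
    true    # stands in for the "[ -e /conf/param.conf ] && ..." line
}

# Method 2, call_scripts() style: the return code is captured and reported.
run_via_call_scripts() {
    fake_resume && ec=$? || ec=$?
    if [ "$ec" -ne 0 ]; then
        echo "E: fake_resume failed with return $ec."
    fi
}

# '|| true' only keeps the demo alive if it is run under 'set -e'.
order_output=$(run_via_order) || true
call_output=$(run_via_call_scripts)
echo "ORDER method printed:        '$order_output'"
echo "call_scripts method printed: '$call_output'"
```

The failure is there in both cases, it is only reported in one of them.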


Why is the error printed with some initramfs images and not with others?

This is because ORDER files are sometimes (see below) generated and packed
into the initramfs, and sometimes not. This depends on how mkinitramfs runs.


Why does update-initramfs generate different ramdisks depending on 'noexec' /tmp?

'update-initramfs' calls 'mkinitramfs', and all the work we're interested in
happens there. mkinitramfs will not call the cache_run_scripts() function from
'/usr/share/initramfs-tools/hook-functions' if /tmp is mounted with 'noexec':

    /usr/sbin/mkinitramfs:
        ..
        if [ -n "$NOEXEC" ]; then
                echo "W: TMPDIR is mounted noexec, will not cache run scripts."
        else
                for b in $(cd "${DESTDIR}/scripts" && find . -mindepth 1 -type d); do
                        cache_run_scripts "${DESTDIR}" "/scripts/${b#./}"
                done
        fi
        ..

But this function is exactly the one that creates the ORDER files:

    /usr/share/initramfs-tools/hook-functions:
        cache_run_scripts()
        {
        ..
            for crs_x in ${runlist}; do
                    [ -f ${initdir}/${crs_x} ] || continue
                    echo "${scriptdir}/${crs_x}" >> ${initdir}/ORDER
                    echo "[ -e /conf/param.conf ] && . /conf/param.conf" >> ${initdir}/ORDER
            done
        ..
        }
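
To see which method a given image will use at boot, one can check whether it
contains ORDER files. A minimal sketch, assuming the image is a
gzip-compressed cpio archive; has_order_files is a hypothetical helper name,
not a part of initramfs-tools, and the listing in the demonstration is made
up.

```sh
#!/bin/sh
# has_order_files: hypothetical helper, reads an archive listing (one path
# per line) on stdin and succeeds if any ORDER file is in it.
has_order_files() {
    grep -q '/ORDER$'
}

# Against a real image (assumption: gzip-compressed cpio) it would be fed as:
#   zcat "/boot/initrd.img-$(uname -r)" | cpio -t 2>/dev/null | has_order_files

# Self-contained demonstration on a made-up listing:
listing='scripts/local-premount/resume
scripts/local-premount/ORDER'
if printf '%s\n' "$listing" | has_order_files; then
    echo "ORDER files present: return codes will not be checked"
else
    echo "no ORDER files: call_scripts() will report failures"
fi
```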


How to fix?

I don't know a good fix for this. Perhaps the best one is to remount /tmp
with 'exec' each time you want to update the initramfs or the kernel.
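
The remount can be wrapped around the update like this. It is only a sketch
under assumptions: /tmp is a separate mount point, the script runs as root,
and RUN is a made-up knob that makes it print the commands instead of running
them (this is the default here, start it with RUN= to really execute).

```sh
#!/bin/sh
# RUN is a hypothetical dry-run switch: "echo" prints the commands,
# an empty value executes them.
RUN=${RUN-echo}

update_with_exec_tmp() {
    $RUN mount -o remount,exec /tmp || return 1
    # Capture the result so /tmp gets 'noexec' back even if the update fails.
    $RUN update-initramfs -u && rc=0 || rc=$?
    $RUN mount -o remount,noexec /tmp
    return "$rc"
}

update_with_exec_tmp
```

For a kernel upgrade the same pair of remounts would go around the package
installation instead of 'update-initramfs -u'.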

The alternative is to point 'mkinitramfs' at a temporary directory that allows
'exec', and the first problem is how to specify it. Probably
'/etc/initramfs-tools/initramfs.conf' can be used, though such usage is of
course not documented. mkinitramfs sources this file:

    /usr/sbin/mkinitramfs:
        . "${CONFDIR}/initramfs.conf"

and you can add 'TMPDIR=DIR_WITH_EXEC' to it, but this file will also be
placed into the ramdisk and sourced by init

    initrd/init:
        # Bring in the main config
        . /conf/initramfs.conf

and I don't know for sure whether such a change will have any consequences.
But anyway, this is not all. The second problem is how 'mkinitramfs' checks
for the 'noexec' option:

    /usr/sbin/mkinitramfs:
        ..
        NOEXEC=""
        fs=$(df -P $DESTDIR | tail -1 | awk '{print $6}')
        if [ -n "$fs" ] && mount | grep -q "on $fs .*noexec" ; then
                NOEXEC=1
        fi
        ..

With such a check, bind-mounting a $TMPDIR directory located on a 'noexec'
partition and remounting it with 'exec' will not work. In other words, the 6th
column of the `df -P` output contains the mount point of the entire partition,
and then the mount options of the entire partition are checked, not those of
$TMPDIR itself, which can be bind-mounted. Hence, you should place $TMPDIR on
an 'exec' partition.
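
The check is easy to try on its own before touching the configuration. A
minimal sketch that applies the same grep pattern as the mkinitramfs excerpt
above; is_noexec is a hypothetical helper and the `mount` output is made up.

```sh
#!/bin/sh
# is_noexec MOUNTPOINT: hypothetical helper; reads `mount`-style output on
# stdin and applies the same grep pattern as the mkinitramfs excerpt.
is_noexec() {
    grep -q "on $1 .*noexec"
}

# Made-up `mount` output: /tmp is noexec, / is not.
sample='/dev/sda1 on / type ext3 (rw,errors=remount-ro)
/dev/sda5 on /tmp type ext3 (rw,noexec,nosuid,nodev)'

for fs in / /tmp; do
    if printf '%s\n' "$sample" | is_noexec "$fs"; then
        echo "$fs: noexec, ORDER files would not be cached"
    else
        echo "$fs: exec allowed"
    fi
done

# On a live system the mount point would come from the same df pipeline,
# applied to the directory you plan to use as TMPDIR:
#   fs=$(df -P "$dir" | tail -1 | awk '{print $6}')
#   mount | is_noexec "$fs" && echo "$dir is on a noexec mount"
```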

And after all this, do you still want to redefine $TMPDIR? -)
