Creating reproducible tests in pull requests so reviewers can verify the code works properly when your codebase lacks unit tests.

Anyone who has worked with unit testing knows that pull request based testing is highly inefficient, in the sense that the tests are lost once the pull request is approved. But while automated testing is still being introduced in a legacy project, code still has to be written to implement new features or bugfixes.

That is where the guideline I am going to present has its niche: getting the job done while preserving sane behaviour in the code.

This guide assumes a web based project; some parts may have to be adapted at your discretion for other use cases.

General Guidelines.

  • Avoid high level instructions that can be written as commands.

It is usually best to avoid telling reviewers to do something they may have to investigate when you can write a command that does it. Example:

Bad example:

“Block the black haired users from having an avatar.”

Good example:

“Execute the following SQL query to block the black haired users from having an avatar.”

update users set avatar_blocked = 1 where hair = 'black';

Another bad example:

Remove lines 500-550 from lib/Users/BlockAvatar.php, since they attempt to connect to an external FTP server and are going to generate an error, and set $got_external_csv to 0.

Good example:

Use the following command to remove lines 500-550 from lib/Users/BlockAvatar.php, since they attempt to connect to an external FTP server and are going to generate an error, and to set $got_external_csv to 0:

perl <( cat << 'EOF'
use 5.30.0;
my $i = 0;
while (<>) {
    $i++; 
    next if $i >= 500 && $i <= 550;
    say << 'END_OF_SAY' if $i == 551;
        # Temporary fix to avoid ftp connections in testing.
        $got_external_csv = 0;
END_OF_SAY
    print;
}
EOF
) lib/Users/BlockAvatar.php > block_avatar_tmp.php
cp block_avatar_tmp.php lib/Users/BlockAvatar.php
rm block_avatar_tmp.php

  • Avoid writing a database query as a bash command; write the query directly so everyone can use whichever database client they are most comfortable with.

Bad example:

echo 'select * from users' | mysql

Good example:

select * from users;

  • Avoid writing operations that users are supposed to do in the webpage as bash commands; instead, ask the reviewers to perform those operations in the webpage, unless a command is really needed.

This is mainly for two reasons. First, if you blindly copy a Firefox request as cURL (POSIX), you are likely to collide with CSRF tokens or leak your authentication cookies, which is not a good deal. Second, if your changes broke something in the frontend, it may go unnoticed in the pull request.

Bad example:

“Delete the Luffy user”

curl -X DELETE www.myweb.com/api/user/luffy

Good example:

Go to https://www.myweb.com/admin/manage_user?user=luffy and press delete this user.

  • Include screenshots of GUI steps if possible, indicating where the reviewer should interact with the webpage and how.

Those screenshots should not replace the text description but complement it, to avoid discriminating against blind reviewers; they should be a help for people with visual minds, not a diversity killer.


Hacking PHP to allow the redeclaration of functions. (II)

This is the next step in my adventure trying to redeclare functions in PHP; you can see the previous effort in the linked post.

We go back to do_bind_function to use what we learned in the last chapter. If you look at it, it looks strange now, since it looks up the zv key when no value has been put in the table yet.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{   
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

It would be great to solve that mystery before going forward and breaking things. We search for that function:

sergio@bahdder ~/php-8.0.3 $ rg do_bind_function
Zend/zend_compile.h
776:ZEND_API zend_result do_bind_function(zval *lcname);

Zend/zend_vm_def.h
7589:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_vm_execute.h
2821:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_compile.c
1054:ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
1061:        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
1072:        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);

UPGRADING.INTERNALS
281:        - do_bind_function()

And then we look at what it does in zend_vm_def.h.

ZEND_VM_HANDLER(141, ZEND_DECLARE_FUNCTION, ANY, ANY)
{   
    USE_OPLINE
    
    SAVE_OPLINE();
    do_bind_function(RT_CONSTANT(opline, opline->op1));
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

Macros everywhere…

Zend/zend_vm_execute.h
402:# define SAVE_OPLINE() EX(opline) = opline

This gives an important hint about how opline is declared, but not about why the function is expected to be in the hash table.

Whatever, I am going to do this modification and restore do_bind_function_error.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{       
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        /* Keep zv intact: zend_hash_add() returns NULL if the key exists. */
        zval *added = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!added) {
            added = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
        zv = added;
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

It is still happening, let’s look elsewhere…

    if (toplevel) {
        if (zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL) {
            zend_hash_update_ptr(CG(function_table), lcname, op_array);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        /* Keep zv intact: zend_hash_add() returns NULL if the key exists. */
        zval *added = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!added) {
            added = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
        zv = added;
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

Let’s try now…

Let’s create a file named a.php with this content:

<?php
function a() {
    print "hola";
}

And now we are going to try this:

 b/usr/local/bin/php -r 'function a(){} a(); include "a.php"; a();'
sergio@bahdder ~/php-8.0.3 $ b/usr/local/bin/php -r 'function a(){ echo "adios"; } a(); include "a.php"; a();'
adioshola

This appears to have worked. If I continue with this, I will try to do it with a function named mock in an extension, to get into extension development while keeping my hands out of core.


Hacking PHP to allow redeclaration of functions. (I)

It is often said that a developer of some language is able to write that language in any language. I can tell you this is very true.

Currently I am on a project that uses PHP, and when testing it PHP does nothing but get in my way when I try to write Perl in PHP, since it does not allow me to redeclare procedural functions.

I have seen responses like this one, https://stackoverflow.com/questions/1244949/mocking-php-functions-in-unit-tests, which recommend using runkit, but I thought it would be cool to get into the PHP code and try to implement such a thing.

The PHP source is a pretty large codebase, so searching for this specific restriction would be hard, but PHP itself gives me a hint. Let’s run this:

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

Let’s search for this message and try to get PHP to do what I mean, just to learn how function declaration works in the PHP source code.

rg 'Cannot redeclare'

Oops, too much output… Let’s try again, excluding from rg the test directories, which are not interesting right now.

rg 'Cannot redeclare' --glob '!Zend/tests' --glob '!tests'

That is better:

Zend/zend_compile.c
1065:           zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
1070:           zend_error_noreturn(error_level, "Cannot redeclare %s()",
6499:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
6817:           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::%s()",
7064:                   zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
7658:                           "Cannot redeclare constant '%s'", ZSTR_VAL(unqualified_name));

Zend/zend_inheritance.c
1038:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s%s::$%s as %s%s::$%s",

ext/opcache/zend_accelerator_util_funcs.c
468:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
473:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));
512:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
517:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));

That gives me a much better hint: Zend/zend_compile.c and ext/opcache/zend_accelerator_util_funcs.c can be the cause of this. Let’s see how they work, starting with Zend/zend_compile.c.

I found this function:

static zend_never_inline ZEND_COLD ZEND_NORETURN void do_bind_function_error(zend_string *lcname, zend_op_array *op_array, zend_bool compile_time) /* {{{ */
{
    zval *zv = zend_hash_find_ex(compile_time ? CG(function_table) : EG(function_table), lcname, 1);
    int error_level = compile_time ? E_COMPILE_ERROR : E_ERROR;
    zend_function *old_function;

    ZEND_ASSERT(zv != NULL);
    old_function = (zend_function*)Z_PTR_P(zv);
    if (old_function->type == ZEND_USER_FUNCTION
        && old_function->op_array.last > 0) {
        zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
                    op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name),
                    ZSTR_VAL(old_function->op_array.filename),
                    old_function->op_array.opcodes[0].lineno);
    } else {
        zend_error_noreturn(error_level, "Cannot redeclare %s()",
            op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name));
    }
}

This function is likely called when PHP reaches a state where a function is redeclared, but its only purpose is to log that and fail. I will delete this function to figure out what breaks in compilation; I probably do not want it anymore anyway.

Shamelessly I run make -j4 knowing this is not going to compile anymore.

/home/sergio/php-8.0.3/Zend/zend_compile.c: In function ‘do_bind_function’:
/home/sergio/php-8.0.3/Zend/zend_compile.c:1063:3: warning: implicit declaration of function ‘do_bind_function_error’; did you mean ‘do_bind_function’? [-Wimplicit-function-declaration]
 1063 |   do_bind_function_error(Z_STR_P(lcname), NULL, 0);
      |   ^~~~~~~~~~~~~~~~~~~~~~
      |   do_bind_function

Ok, let’s see what is happening here.

I went into do_bind_function; it looks promising:

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

We are not in Kansas anymore: there is a good bunch of strange macros and functions here that are not familiar to me, like UNEXPECTED or EG, but the PHP developers were kind enough to choose good variable and function names, so I may be able to tweak the code a little.

Maybe silencing the error is enough…

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{    
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

mkdir b && make -j4 && INSTALL_ROOT=b make install

But there are more calls to do_bind_function_error:

    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

This gives me a bad feeling. I don’t think this == NULL means the key is updated anyway; maybe I assumed too much. Let’s look at what it does; I may have to go back and rethink all the previous changes.

static zend_always_inline void *zend_hash_add_ptr(HashTable *ht, zend_string *key, void *pData)
{
    zval tmp, *zv;

    ZVAL_PTR(&tmp, pData);
    zv = zend_hash_add(ht, key, &tmp);
    if (zv) {
        ZEND_ASSUME(Z_PTR_P(zv));
        return Z_PTR_P(zv);
    } else {
        return NULL;
    }
}

Let’s look into zend_hash_add…

Oops, I made too many assumptions…

ZEND_API zval* ZEND_FASTCALL zend_hash_add(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update_ind(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE | HASH_UPDATE_INDIRECT);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_add_new(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD_NEW);
}

If there is an update function, it is clear that add won’t update. I also found an interesting function:

ZEND_API zval* ZEND_FASTCALL zend_hash_add_or_update(HashTable *ht, zend_string *key, zval *pData, uint32_t flag)
{
    if (flag == HASH_ADD) {
        return zend_hash_add(ht, key, pData);
    } else if (flag == HASH_ADD_NEW) {
        return zend_hash_add_new(ht, key, pData);
    } else if (flag == HASH_UPDATE) {
        return zend_hash_update(ht, key, pData);
    } else {
        ZEND_ASSERT(flag == (HASH_UPDATE|HASH_UPDATE_INDIRECT));
        return zend_hash_update_ind(ht, key, pData);
    }
}

Unfortunately, this means the bitmask the flag contains cannot be used to tell the hash table to insert or update, whichever it needs. Let’s look at the private function they all call to see whether what I am thinking is true.

static zend_always_inline zval *_zend_hash_str_add_or_update_i(HashTable *ht, const char *str, size_t len, zend_ulong h, zval *pData, uint32_t flag)
{
    zend_string *key;
    uint32_t nIndex;
    uint32_t idx;
    Bucket *p;

    IS_CONSISTENT(ht);
    HT_ASSERT_RC1(ht);

    if (UNEXPECTED(HT_FLAGS(ht) & (HASH_FLAG_UNINITIALIZED|HASH_FLAG_PACKED))) {
        if (EXPECTED(HT_FLAGS(ht) & HASH_FLAG_UNINITIALIZED)) {
            zend_hash_real_init_mixed(ht);
            goto add_to_hash;
        } else {
            zend_hash_packed_to_hash(ht);
        }
    } else if ((flag & HASH_ADD_NEW) == 0) {
        p = zend_hash_str_find_bucket(ht, str, len, h);

        if (p) {
            zval *data;

            if (flag & HASH_ADD) {
                if (!(flag & HASH_UPDATE_INDIRECT)) {
                    return NULL;
                }
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if (Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                    if (Z_TYPE_P(data) != IS_UNDEF) {
                        return NULL;
                    }
                } else {
                    return NULL;
                }
            } else {
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if ((flag & HASH_UPDATE_INDIRECT) && Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                }
            }
            if (ht->pDestructor) {
                ht->pDestructor(data);
            }
            ZVAL_COPY_VALUE(data, pData);
            return data;
        }
    }

    ZEND_HASH_IF_FULL_DO_RESIZE(ht);        /* If the Hash table is full, resize it */

add_to_hash:
    idx = ht->nNumUsed++;
    ht->nNumOfElements++;
    p = ht->arData + idx;
    p->key = key = zend_string_init(str, len, GC_FLAGS(ht) & IS_ARRAY_PERSISTENT);
    p->h = ZSTR_H(key) = h;
    HT_FLAGS(ht) &= ~HASH_FLAG_STATIC_KEYS;
    ZVAL_COPY_VALUE(&p->val, pData);
    nIndex = h | ht->nTableMask;
    Z_NEXT(p->val) = HT_HASH(ht, nIndex);
    HT_HASH(ht, nIndex) = HT_IDX_TO_HASH(idx);

    return &p->val;
}

This means that if I can make this function receive HASH_ADD | HASH_UPDATE_INDIRECT, I may be able to update the hash table. Let’s continue later; this was rough, but I am starting to grasp the concepts.


Why does Gentoo GNU/Linux rule (for me) as a PHP web development environment?

Usually clients do not have the latest PHP version installed, and this is where Gentoo comes in handy thanks to the slot system of the Portage package manager. Let’s look at it in depth…

Currently Gentoo packages different versions of PHP, like 7.2, 7.3, 7.4 and 8.0, and thanks to the slot system you can have more than one installed. But that’s not all: you can also “clone” the /etc/init.d/php-fpm script, appending a hyphen and the name of the version you want this new php-fpm instance to run, so having a multiversion development environment is no longer hard.

These capabilities are also pretty handy for servers: they allow you to create a new init service with a specific version, isolated under its own user, if you think some service is more critical than the others in terms of security or performance, which gives you a powerful server environment too. I use this with my Nextcloud instance so its data cannot be accessed or manipulated from other PHP services.

Installing Gentoo for the first time is never easy, but it provides such a powerful and customizable operating system that the effort of learning this GNU/Linux distribution is not wasted.