
Hacking PHP to allow the redeclaration of functions. (II)

This is the next step in my adventure of trying to redeclare functions in PHP. You can see the previous effort in the linked post.

We are going back to do_bind_function to use what we learned in the last chapter. If you look at it now, it looks strange: it looks up rtd_key in the function table and expects to find a value there, even though this function has not put anything in the table yet.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{   
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}
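One way to read that lookup is that something earlier, presumably the compiler, already stored the function under a temporary key, and binding only renames that entry to the real name, refusing when the name is taken. Here is a toy sketch of that reading in plain C. The table and every toy_* name are made up for illustration, and the "tmp:a:1" key format used in the note below is invented; none of this is the Zend API.

```c
#include <stddef.h>
#include <string.h>

#define TOY_CAP 8

/* Hypothetical toy table; none of these names exist in the Zend engine. */
typedef struct {
    const char *key;   /* NULL marks a free slot */
    const char *body;  /* stands in for the compiled function */
} toy_entry;

typedef struct {
    toy_entry slots[TOY_CAP];
} toy_table;

/* Linear lookup by key; returns NULL when the key is absent. */
static toy_entry *toy_find(toy_table *t, const char *key)
{
    for (size_t i = 0; i < TOY_CAP; i++) {
        if (t->slots[i].key != NULL && strcmp(t->slots[i].key, key) == 0) {
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Insert only; returns NULL if the key exists or the table is full. */
static toy_entry *toy_add(toy_table *t, const char *key, const char *body)
{
    if (toy_find(t, key) != NULL) {
        return NULL;
    }
    for (size_t i = 0; i < TOY_CAP; i++) {
        if (t->slots[i].key == NULL) {
            t->slots[i].key = key;
            t->slots[i].body = body;
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Rename an entry in place; refuses when new_key is already taken. */
static toy_entry *toy_set_key(toy_table *t, toy_entry *e, const char *new_key)
{
    if (toy_find(t, new_key) != NULL) {
        return NULL;
    }
    e->key = new_key;
    return e;
}

/* Same shape as do_bind_function: find the temporary key, then rename it.
 * Returns 0 on success and -1 on failure. */
static int toy_bind(toy_table *t, const char *tmp_key, const char *name)
{
    toy_entry *e = toy_find(t, tmp_key);
    if (e == NULL) {
        return -1;
    }
    return toy_set_key(t, e, name) != NULL ? 0 : -1;
}
```

With this toy, adding an entry as "tmp:a:1" and binding it to "a" succeeds, while binding a second temporary entry to "a" returns -1, which corresponds to the situation where PHP reports "Cannot redeclare".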

It would be great to solve that mystery before moving on and breaking things. Let's search for that function:

sergio@bahdder ~/php-8.0.3 $ rg do_bind_function
Zend/zend_compile.h
776:ZEND_API zend_result do_bind_function(zval *lcname);

Zend/zend_vm_def.h
7589:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_vm_execute.h
2821:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_compile.c
1054:ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
1061:        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
1072:        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);

UPGRADING.INTERNALS
281:        - do_bind_function()

Next, let's look at what zend_vm_def.h does with it.

ZEND_VM_HANDLER(141, ZEND_DECLARE_FUNCTION, ANY, ANY)
{   
    USE_OPLINE
    
    SAVE_OPLINE();
    do_bind_function(RT_CONSTANT(opline, opline->op1));
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

Macros everywhere…

Zend/zend_vm_execute.h
402:# define SAVE_OPLINE() EX(opline) = opline

This gives an important hint about how opline is declared, but not about why the function is expected to be in the hash table already.
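As a sketch of one possible reading of that macro: the handler works with an opline cursor that lives outside the frame, and SAVE_OPLINE() copies it back into the frame. The structs below are hypothetical stand-ins, the EX definition is an assumption, and the real engine may keep opline in a register instead of a local variable.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real zend_op and zend_execute_data are far larger. */
typedef struct { int opcode; } toy_op;
typedef struct { const toy_op *opline; } toy_execute_data;

/* Assumed shape of the accessor macro, plus the macro quoted above. */
#define EX(field)     (execute_data->field)
#define SAVE_OPLINE() EX(opline) = opline

/* Advances a local cursor `steps` times, then publishes it to the frame. */
static void toy_run(toy_execute_data *execute_data, size_t steps)
{
    const toy_op *opline = EX(opline);  /* local copy the handler works with */
    for (size_t i = 0; i < steps; i++) {
        opline++;                       /* advance without touching the frame */
    }
    SAVE_OPLINE();                      /* write the local cursor back */
}
```

Until SAVE_OPLINE() runs, the frame still points at the old instruction, which would explain why handlers call it before doing anything that might need the current position.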

Whatever, I am going to restore do_bind_function_error and make this modification:

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{       
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
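        /* Keep zv intact: a failed add returns NULL, and update needs the original value. */
        zval *bound = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!bound) {
            bound = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
        zv = bound;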
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

Let's try it:

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

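It is still happening, so let's look elsewhere… The toplevel branch that also called do_bind_function_error (seen in the previous post) now falls back to an update when the add fails, and I removed the error checks from do_bind_function again: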

    if (toplevel) {
        if (zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL) {
            zend_hash_update_ptr(CG(function_table), lcname, op_array);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
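        /* Keep zv intact: a failed add returns NULL, and update needs the original value. */
        zval *bound = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!bound) {
            bound = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
        zv = bound;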
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

Let’s try now…

Let’s create a file named a.php with this content:

<?php
function a() {
    print "hola";
}

And now we are going to try this:

 b/usr/local/bin/php -r 'function a(){} a(); include "a.php"; a();'
sergio@bahdder ~/php-8.0.3 $ b/usr/local/bin/php -r 'function a(){ echo "adios"; } a(); include "a.php"; a();'
adiosholasergio@bahdder ~/php-8.0.3 $ 

This appears to have worked. If I continue with this, I will try to do the same with a function named mock in an extension, to get into extension development while keeping my hands out of core.


Hacking PHP to allow redeclaration of functions. (I)

It is often said that a developer of some language is able to write that language in any language. I can tell you this is a great truth.

Currently I am on a project that uses PHP, and when testing it PHP does nothing but get in my way: I am trying to write Perl in PHP, and it does not allow me to redeclare procedural functions.

I have seen answers like this one, https://stackoverflow.com/questions/1244949/mocking-php-functions-in-unit-tests, which recommend using runkit, but I thought it would be cool to get into the PHP code and try to implement such a thing myself.

PHP is a pretty large codebase, so searching for this specific restriction would be hard, but PHP itself gives me a hint. Let's run this:

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

Let's search for this message and try to get PHP to do what I mean, just to learn how function declaration works in the PHP source code.

rg 'Cannot redeclare'

Oops, too much output… Let's try again, excluding from the rg search the test directories, which are not interesting right now.

rg 'Cannot redeclare' --glob '!Zend/tests' --glob '!tests'

That's better:

Zend/zend_compile.c
1065:           zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
1070:           zend_error_noreturn(error_level, "Cannot redeclare %s()",
6499:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
6817:           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::%s()",
7064:                   zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
7658:                           "Cannot redeclare constant '%s'", ZSTR_VAL(unqualified_name));

Zend/zend_inheritance.c
1038:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s%s::$%s as %s%s::$%s",

ext/opcache/zend_accelerator_util_funcs.c
468:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
473:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));
512:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
517:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));

That gives me a much better hint: Zend/zend_compile.c and ext/opcache/zend_accelerator_util_funcs.c can be the cause of this. Let's see how they work, starting with Zend/zend_compile.c.

I found this function:

static zend_never_inline ZEND_COLD ZEND_NORETURN void do_bind_function_error(zend_string *lcname, zend_op_array *op_array, zend_bool compile_time) /* {{{ */
{
    zval *zv = zend_hash_find_ex(compile_time ? CG(function_table) : EG(function_table), lcname, 1);
    int error_level = compile_time ? E_COMPILE_ERROR : E_ERROR;
    zend_function *old_function;

    ZEND_ASSERT(zv != NULL);
    old_function = (zend_function*)Z_PTR_P(zv);
    if (old_function->type == ZEND_USER_FUNCTION
        && old_function->op_array.last > 0) {
        zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
                    op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name),
                    ZSTR_VAL(old_function->op_array.filename),
                    old_function->op_array.opcodes[0].lineno);
    } else {
        zend_error_noreturn(error_level, "Cannot redeclare %s()",
            op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name));
    }
}

This function is likely called when PHP reaches a state where a function is being redeclared, but its only purpose is to log that and fail. I will delete it to figure out what breaks in compilation; I probably don't want this function anymore anyway.

Shamelessly I run make -j4 knowing this is not going to compile anymore.

/home/sergio/php-8.0.3/Zend/zend_compile.c: In function ‘do_bind_function’:
/home/sergio/php-8.0.3/Zend/zend_compile.c:1063:3: warning: implicit declaration of function ‘do_bind_function_error’; did you mean ‘do_bind_function’? [-Wimplicit-function-declaration]
 1063 |   do_bind_function_error(Z_STR_P(lcname), NULL, 0);
      |   ^~~~~~~~~~~~~~~~~~~~~~
      |   do_bind_function

Ok, let’s see what is happening here.

I got into do_bind_function, and it looks promising:

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

This is not Kansas anymore: there is a good bunch of strange macros and functions that are not familiar to me, like UNEXPECTED or EG, but the PHP developers were kind enough to choose good variable and function names, so I may be able to tweak the code a little.

Maybe silencing the error is enough…

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{    
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

mkdir b && make -j4 && INSTALL_ROOT=b make install

But there are more calls to do_bind_function_error:

    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

This gives me a bad feeling. I don't think this == NULL means the key was updated anyway; maybe I assumed too much. Let's look at what it does. I may have to go back and rethink all the previous changes.

static zend_always_inline void *zend_hash_add_ptr(HashTable *ht, zend_string *key, void *pData)
{
    zval tmp, *zv;

    ZVAL_PTR(&tmp, pData);
    zv = zend_hash_add(ht, key, &tmp);
    if (zv) {
        ZEND_ASSUME(Z_PTR_P(zv));
        return Z_PTR_P(zv);
    } else {
        return NULL;
    }
}

Let’s look into zend_hash_add…

Oops, I made too many assumptions…

ZEND_API zval* ZEND_FASTCALL zend_hash_add(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update_ind(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE | HASH_UPDATE_INDIRECT);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_add_new(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD_NEW);
}

If there is an update, it is clear that add won't update. I also found an interesting function:

ZEND_API zval* ZEND_FASTCALL zend_hash_add_or_update(HashTable *ht, zend_string *key, zval *pData, uint32_t flag)
{
    if (flag == HASH_ADD) {
        return zend_hash_add(ht, key, pData);
    } else if (flag == HASH_ADD_NEW) {
        return zend_hash_add_new(ht, key, pData);
    } else if (flag == HASH_UPDATE) {
        return zend_hash_update(ht, key, pData);
    } else {
        ZEND_ASSERT(flag == (HASH_UPDATE|HASH_UPDATE_INDIRECT));
        return zend_hash_update_ind(ht, key, pData);
    }
}
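To make the add/update difference concrete, here is a small sketch with a toy map. toy_map, toy_put and the TOY_* flags are invented for this example and only mimic the idea of one worker function driven by a flag; they are not the Zend functions.

```c
#include <stddef.h>
#include <string.h>

#define TOY_CAP 8

/* Hypothetical flags and map, named after the idea rather than the Zend API. */
enum { TOY_ADD, TOY_UPDATE };

typedef struct {
    const char *key;  /* NULL marks a free slot */
    int value;
} toy_pair;

typedef struct {
    toy_pair slots[TOY_CAP];
} toy_map;

/* One worker, two behaviours, selected by flag.
 * Returns a pointer to the stored value, or NULL when nothing was stored. */
static int *toy_put(toy_map *m, const char *key, int value, int flag)
{
    toy_pair *free_slot = NULL;
    for (size_t i = 0; i < TOY_CAP; i++) {
        toy_pair *p = &m->slots[i];
        if (p->key == NULL) {
            if (free_slot == NULL) {
                free_slot = p;          /* remember the first free slot */
            }
        } else if (strcmp(p->key, key) == 0) {
            if (flag == TOY_ADD) {
                return NULL;            /* add never overwrites */
            }
            p->value = value;           /* update overwrites */
            return &p->value;
        }
    }
    if (free_slot == NULL) {
        return NULL;                    /* table is full */
    }
    free_slot->key = key;
    free_slot->value = value;
    return &free_slot->value;
}

/* Insert-or-overwrite built from the two primitives. */
static int *toy_upsert(toy_map *m, const char *key, int value)
{
    int *stored = toy_put(m, key, value, TOY_ADD);
    if (stored == NULL) {
        /* Pass the original value again, not the NULL result of the add. */
        stored = toy_put(m, key, value, TOY_UPDATE);
    }
    return stored;
}
```

In this toy, TOY_UPDATE alone also inserts a missing key, so the fallback in toy_upsert is only there to show the two primitives side by side.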

Unfortunately this means the bitmask the flag contains (I looked at it without you) cannot be used to tell the hash table to insert or update, whichever it needs. Let's look in the private function they all call to see whether what I am thinking is true.

static zend_always_inline zval *_zend_hash_str_add_or_update_i(HashTable *ht, const char *str, size_t len, zend_ulong h, zval *pData, uint32_t flag)
{
    zend_string *key;
    uint32_t nIndex;
    uint32_t idx;
    Bucket *p;

    IS_CONSISTENT(ht);
    HT_ASSERT_RC1(ht);

    if (UNEXPECTED(HT_FLAGS(ht) & (HASH_FLAG_UNINITIALIZED|HASH_FLAG_PACKED))) {
        if (EXPECTED(HT_FLAGS(ht) & HASH_FLAG_UNINITIALIZED)) {
            zend_hash_real_init_mixed(ht);
            goto add_to_hash;
        } else {
            zend_hash_packed_to_hash(ht);
        }
    } else if ((flag & HASH_ADD_NEW) == 0) {
        p = zend_hash_str_find_bucket(ht, str, len, h);

        if (p) {
            zval *data;

            if (flag & HASH_ADD) {
                if (!(flag & HASH_UPDATE_INDIRECT)) {
                    return NULL;
                }
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if (Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                    if (Z_TYPE_P(data) != IS_UNDEF) {
                        return NULL;
                    }
                } else {
                    return NULL;
                }
            } else {
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if ((flag & HASH_UPDATE_INDIRECT) && Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                }
            }
            if (ht->pDestructor) {
                ht->pDestructor(data);
            }
            ZVAL_COPY_VALUE(data, pData);
            return data;
        }
    }

    ZEND_HASH_IF_FULL_DO_RESIZE(ht);        /* If the Hash table is full, resize it */

add_to_hash:
    idx = ht->nNumUsed++;
    ht->nNumOfElements++;
    p = ht->arData + idx;
    p->key = key = zend_string_init(str, len, GC_FLAGS(ht) & IS_ARRAY_PERSISTENT);
    p->h = ZSTR_H(key) = h;
    HT_FLAGS(ht) &= ~HASH_FLAG_STATIC_KEYS;
    ZVAL_COPY_VALUE(&p->val, pData);
    nIndex = h | ht->nTableMask;
    Z_NEXT(p->val) = HT_HASH(ht, nIndex);
    HT_HASH(ht, nIndex) = HT_IDX_TO_HASH(idx);

    return &p->val;
}

This means that if I can make this function receive HASH_ADD | HASH_UPDATE_INDIRECT, I may be able to update the hash table. Let's continue later; this was rough, but I am starting to catch the concepts.
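Before relying on that, it helps to write down what the branch quoted above does when the key already exists. The sketch below is a transcription of only that branch into a pure function. The TOY_* flag values are placeholders rather than the real constants, and the three slot kinds are a simplification of the IS_INDIRECT and IS_UNDEF checks.

```c
#include <stdint.h>

/* Placeholder flag values; the real ones live in zend_hash.h. */
#define TOY_HASH_UPDATE          (1u << 0)
#define TOY_HASH_ADD             (1u << 1)
#define TOY_HASH_UPDATE_INDIRECT (1u << 2)

/* What the existing bucket holds. */
typedef enum {
    SLOT_PLAIN,           /* an ordinary value */
    SLOT_INDIRECT_UNDEF,  /* indirect pointer to an undefined value */
    SLOT_INDIRECT_SET     /* indirect pointer to a defined value */
} toy_slot;

/* What happens to the existing entry. */
typedef enum {
    KEEP_OLD_RETURN_NULL, /* nothing is written, caller gets NULL */
    OVERWRITE_SLOT,       /* the bucket value itself is replaced */
    OVERWRITE_TARGET      /* the value behind the indirection is replaced */
} toy_outcome;

/* Mirrors the `if (p) { ... }` branch for a key that is already present. */
static toy_outcome toy_existing_key_outcome(uint32_t flag, toy_slot slot)
{
    if (flag & TOY_HASH_ADD) {
        if (!(flag & TOY_HASH_UPDATE_INDIRECT)) {
            return KEEP_OLD_RETURN_NULL;
        }
        /* With the indirect flag, add only fills an undefined target. */
        if (slot == SLOT_INDIRECT_UNDEF) {
            return OVERWRITE_TARGET;
        }
        return KEEP_OLD_RETURN_NULL;
    }
    /* Update path: follow the indirection only when asked to. */
    if ((flag & TOY_HASH_UPDATE_INDIRECT) && slot != SLOT_PLAIN) {
        return OVERWRITE_TARGET;
    }
    return OVERWRITE_SLOT;
}
```

Under this reading, TOY_HASH_ADD | TOY_HASH_UPDATE_INDIRECT on a plain slot still gives KEEP_OLD_RETURN_NULL; only an indirect slot whose target is undefined gets overwritten, so this is worth checking against the real function table entries.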