Hacking PHP to allow redeclaration of functions. (I)

It is often said that a developer of some language is able to write that language in any language. I can tell you this is a great truth.

I am currently working on a project that uses PHP, and while testing it PHP keeps getting in my way: when I try to write Perl in PHP, it does not let me redeclare procedural functions.

I have seen answers like this one, https://stackoverflow.com/questions/1244949/mocking-php-functions-in-unit-tests, which recommend using runkit, but I thought it would be cool to get into the PHP source code and try to implement such a thing myself.

The PHP source is a pretty large codebase, so searching for this specific restriction would be hard, but PHP itself gives me a hint. Let’s run this:

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

Let’s search for this message and try to get PHP to do what I mean, just to learn how function declaration works in the PHP source code.

rg 'Cannot redeclare'

Oops, too much output… Let’s try again, excluding the test directories from the rg search, since they are not interesting right now.

rg 'Cannot redeclare' --glob '!Zend/tests' --glob '!tests'

That’s better:

Zend/zend_compile.c
1065:           zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
1070:           zend_error_noreturn(error_level, "Cannot redeclare %s()",
6499:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
6817:           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::%s()",
7064:                   zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
7658:                           "Cannot redeclare constant '%s'", ZSTR_VAL(unqualified_name));

Zend/zend_inheritance.c
1038:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s%s::$%s as %s%s::$%s",

ext/opcache/zend_accelerator_util_funcs.c
468:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
473:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));
512:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
517:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));

That gives me a much better hint: Zend/zend_compile.c and ext/opcache/zend_accelerator_util_funcs.c can be the cause of this. Let’s see how they work, starting with Zend/zend_compile.c.

I found this function:

static zend_never_inline ZEND_COLD ZEND_NORETURN void do_bind_function_error(zend_string *lcname, zend_op_array *op_array, zend_bool compile_time) /* {{{ */
{
    zval *zv = zend_hash_find_ex(compile_time ? CG(function_table) : EG(function_table), lcname, 1);
    int error_level = compile_time ? E_COMPILE_ERROR : E_ERROR;
    zend_function *old_function;

    ZEND_ASSERT(zv != NULL);
    old_function = (zend_function*)Z_PTR_P(zv);
    if (old_function->type == ZEND_USER_FUNCTION
        && old_function->op_array.last > 0) {
        zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
                    op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name),
                    ZSTR_VAL(old_function->op_array.filename),
                    old_function->op_array.opcodes[0].lineno);
    } else {
        zend_error_noreturn(error_level, "Cannot redeclare %s()",
            op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name));
    }
}

This function is likely called when PHP reaches a state where a function is being redeclared, but its only purpose is to report that and fail. I will delete this function to figure out what breaks at compile time; I most likely do not want it anymore anyway.

Shamelessly I run make -j4 knowing this is not going to compile anymore.

/home/sergio/php-8.0.3/Zend/zend_compile.c: In function ‘do_bind_function’:
/home/sergio/php-8.0.3/Zend/zend_compile.c:1063:3: warning: implicit declaration of function ‘do_bind_function_error’; did you mean ‘do_bind_function’? [-Wimplicit-function-declaration]
 1063 |   do_bind_function_error(Z_STR_P(lcname), NULL, 0);
      |   ^~~~~~~~~~~~~~~~~~~~~~
      |   do_bind_function

Ok, let’s see what is happening here.

I went into do_bind_function, and it looks promising:

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

We are not in Kansas anymore: there is a good bunch of strange macros and functions here that are not familiar to me, like UNEXPECTED or EG, but the PHP developers were kind enough to choose good variable and function names, so I may be able to tweak the code a little.

Maybe silencing the error is enough…

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{    
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

mkdir b && make -j4 && INSTALL_ROOT=b make install

But there are more calls to do_bind_function_error:

    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

This gives me a bad feeling. I don’t think == NULL here means the key gets updated anyway; maybe I assumed too much. Let’s look at what it does; I may have to go back and rethink all the previous changes.

static zend_always_inline void *zend_hash_add_ptr(HashTable *ht, zend_string *key, void *pData)
{
    zval tmp, *zv;

    ZVAL_PTR(&tmp, pData);
    zv = zend_hash_add(ht, key, &tmp);
    if (zv) {
        ZEND_ASSUME(Z_PTR_P(zv));
        return Z_PTR_P(zv);
    } else {
        return NULL;
    }
}

Let’s look into zend_hash_add…

Oops, I made too many assumptions…

ZEND_API zval* ZEND_FASTCALL zend_hash_add(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update_ind(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE | HASH_UPDATE_INDIRECT);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_add_new(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD_NEW);
}

If there is an update function, it is clear that add won’t update. I also found an interesting function:

ZEND_API zval* ZEND_FASTCALL zend_hash_add_or_update(HashTable *ht, zend_string *key, zval *pData, uint32_t flag)
{
    if (flag == HASH_ADD) {
        return zend_hash_add(ht, key, pData);
    } else if (flag == HASH_ADD_NEW) {
        return zend_hash_add_new(ht, key, pData);
    } else if (flag == HASH_UPDATE) {
        return zend_hash_update(ht, key, pData);
    } else {
        ZEND_ASSERT(flag == (HASH_UPDATE|HASH_UPDATE_INDIRECT));
        return zend_hash_update_ind(ht, key, pData);
    }
}

Unfortunately, this means the bitmask in the flag (I looked at it without you) cannot be used to tell the hash table to insert or update, whichever is needed. Let’s look at the private function they all call to see whether what I am thinking is true. The listing below is its sibling that takes a C string key, _zend_hash_str_add_or_update_i.

static zend_always_inline zval *_zend_hash_str_add_or_update_i(HashTable *ht, const char *str, size_t len, zend_ulong h, zval *pData, uint32_t flag)
{
    zend_string *key;
    uint32_t nIndex;
    uint32_t idx;
    Bucket *p;

    IS_CONSISTENT(ht);
    HT_ASSERT_RC1(ht);

    if (UNEXPECTED(HT_FLAGS(ht) & (HASH_FLAG_UNINITIALIZED|HASH_FLAG_PACKED))) {
        if (EXPECTED(HT_FLAGS(ht) & HASH_FLAG_UNINITIALIZED)) {
            zend_hash_real_init_mixed(ht);
            goto add_to_hash;
        } else {
            zend_hash_packed_to_hash(ht);
        }
    } else if ((flag & HASH_ADD_NEW) == 0) {
        p = zend_hash_str_find_bucket(ht, str, len, h);

        if (p) {
            zval *data;

            if (flag & HASH_ADD) {
                if (!(flag & HASH_UPDATE_INDIRECT)) {
                    return NULL;
                }
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if (Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                    if (Z_TYPE_P(data) != IS_UNDEF) {
                        return NULL;
                    }
                } else {
                    return NULL;
                }
            } else {
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if ((flag & HASH_UPDATE_INDIRECT) && Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                }
            }
            if (ht->pDestructor) {
                ht->pDestructor(data);
            }
            ZVAL_COPY_VALUE(data, pData);
            return data;
        }
    }

    ZEND_HASH_IF_FULL_DO_RESIZE(ht);        /* If the Hash table is full, resize it */

add_to_hash:
    idx = ht->nNumUsed++;
    ht->nNumOfElements++;
    p = ht->arData + idx;
    p->key = key = zend_string_init(str, len, GC_FLAGS(ht) & IS_ARRAY_PERSISTENT);
    p->h = ZSTR_H(key) = h;
    HT_FLAGS(ht) &= ~HASH_FLAG_STATIC_KEYS;
    ZVAL_COPY_VALUE(&p->val, pData);
    nIndex = h | ht->nTableMask;
    Z_NEXT(p->val) = HT_HASH(ht, nIndex);
    HT_HASH(ht, nIndex) = HT_IDX_TO_HASH(idx);

    return &p->val;
}

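My first thought was that if I can make this function get HASH_ADD | HASH_UPDATE_INDIRECT I may be able to update the hash table. Reading the branches again, though, that combination only overwrites a slot that is IS_INDIRECT and points at an IS_UNDEF value, and returns NULL otherwise; it is the path without HASH_ADD (that is, HASH_UPDATE) that overwrites an existing entry unconditionally. Let’s continue later. This was rough, but I am starting to catch the concepts.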

By sergiotarxz

I am a software developer with high interest on free software.
