Categories
git GNU/Linux ssh Windows

Dealing with SSH key management in a mixed Windows and GNU/Linux environment with WSL.

I am going to drop a bunch of tips which may be useful to people using SSH in a extensive way for development to authenticate against their remote git user in say Gitlab, Gitea, Github or Bitbucket and happens to not being able to work in the best operative system for development.

The first thing is SSH key generation, you surely would not like to have to update your SSH key in the remote Git server because you accidentally generated a new pair of keys so you should be careful while typing the ssh-keygen, you should do it before submiting your key to the remote server and NEVER again.

Choose carefully the root of your ssh keys in every computer and put yourself the rule of never overriding that SSH key pair to avoid losing accidentally the access to the remote repository.

That said you install your GNU/Linux distribution of choice, generate a key pair and submit the .ssh/id_rsa.pub to the remote repository and then you should be able to work from your WSL user cloning repositories, but then you want to clone the repository in for example a directory owned by www-data.

You may try:

cd /var/www/
sudo git clone ssh://user@host/myfancyrepo

That will give you a beautiful permission denied and you may stick confused because of this, but you should be aware ssh keys are private for every user, so if you think about it, it’s simply logic root has different ssh keys than your user.

A correct approach would be:

sudo mkdir ~root/.ssh
sudo cp ~/.ssh/id_rsa{,.pub} ~root/.ssh
sudo git clone ssh://user@host/myfancyrepo

But correct is not good, a better approach would be to do it with the user owner of the folder like this since when the directory with the repository is created if it needs some sort of installation, say composer or npm you will be tempted to do that installation with root which is absolutely discouraged.

sudo mkdir ~www-data/.ssh
sudo cp ~/.ssh/id_rsa{,.pub} ~www-data/.ssh/
sudo -u www-data git clone ssh://user@host/myfancyrepo

If you are too often having to use git repositories outside your WSL environment in Windows folders, you will find soon how slow it can get, I recommend you to install the Git for Windows and ensure it is in the Windows PATH and then add this lines to your .bashrc:

export WIN_DIR=/mnt/c
export GIT=$(which git)

git_wrapper() {
    if perl -e 'exit int($ARGV[1] !~ /^@{[$ARGV[0]]}(?:\/|$)/)' $WIN_DIR $PWD; then
        git.exe $@ 
    else  
        $GIT $@
    fi      
}

alias git=git_wrapper 

You will need to copy your SSH keys to the Windows user like this:

sudo mkdir -pv /mnt/c/Users/<youruser>/.ssh/
sudo cp ~/.ssh/{id_rsa{,.pub} /mnt/c/Users/<youruser>/.ssh/

This should allow you to painlessly be able to use wsl to work with both WSL repositories and Windows ones.

This post is a compilation of common issues I have been seen and suffered during the development in Windows in my job and I hope it helps somebody else to do not fall in some traps to the newbies leading with SSH key management.

Categories
PHP Zend

Hacking PHP to allow the redeclaration of functions. (II)

This is the next step in my adventure trying to redeclarate functions in PHP, you can see the previous effort in the linked post.

We are going again to the do_bind_function to use what we learned in the last chapter if you look at it is looks strange know since it is finding the zv key when it has still not putting a value in the table.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{   
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

Would be great to solve that mistery before getting forward breaking things. We search for that function:

sergio@bahdder ~/php-8.0.3 $ rg do_bind_function
Zend/zend_compile.h
776:ZEND_API zend_result do_bind_function(zval *lcname);

Zend/zend_vm_def.h
7589:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_vm_execute.h
2821:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_compile.c
1054:ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
1061:        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
1072:        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);

UPGRADING.INTERNALS
281:        - do_bind_function()

And then we are going to look at zend_vm_def.h what it does.

ZEND_VM_HANDLER(141, ZEND_DECLARE_FUNCTION, ANY, ANY)
{   
    USE_OPLINE
    
    SAVE_OPLINE();
    do_bind_function(RT_CONSTANT(opline, opline->op1));
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

Macros everywhere…

Zend/zend_vm_execute.h
402:# define SAVE_OPLINE() EX(opline) = opline

This gives a important hint about how opline is declared, but not why the function is expected to be in the hash table.

Whatever, I am going to do this modification and restore do_bind_function_error.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{       
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!zv) {
            zv = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}
sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

It is still happening, let’s look elsewhere…

if (toplevel) {
        if (zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL) {
            zend_hash_update_ptr(CG(function_table), lcname, op_array);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }
ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!zv) {
            zv = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

Let’s try now…

Let’s create a file named a.php with this content:

<?php
function a() {
    print "hola";
}

And now we are going to try this:

 b/usr/local/bin/php -r 'function a(){} a(); include "a.php"; a();'
sergio@bahdder ~/php-8.0.3 $ b/usr/local/bin/php -r 'function a(){ echo "adios"; } a(); include "a.php"; a();'
adiosholasergio@bahdder ~/php-8.0.3 $ 

This appears to have worked, if I continue with this I will try to do this with a function named mock in a extension to get into extension development keeping my hands out of core.

Categories
PHP Uncategorized Zend

Hacking PHP to allow redeclaration of functions. (I)

It is often said that a developer of some language is able to write that language in any language. I can tell you this is a great true.

Currently I am in a project which uses the PHP technology and when testing it PHP does nothing but get into my way when trying to write Perl in PHP not allowing me to redeclare procedural functions.

I have seen responses in like this https://stackoverflow.com/questions/1244949/mocking-php-functions-in-unit-tests which recommend the usage of runkit, but I thought it would be could to get into the PHP code and try to implement such a thing.

The PHP code is a pretty large codebase so searching for this concrete restriction would be hard, but I have a hint given by the own PHP, let’s run this:

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

Let’s search this message and try to get PHP do what I mean just for learning how the function declaration works in the PHP source code.

rg 'Cannot redeclare'

Ups, too much output… Let’s try again stripping some directories from the rg used in test which right now are not interesting.

rg 'Cannot redeclare' --glob '!Zend/tests' --glob '!tests'

That’s is better:

Zend/zend_compile.c
1065:           zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
1070:           zend_error_noreturn(error_level, "Cannot redeclare %s()",
6499:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
6817:           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::%s()",
7064:                   zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
7658:                           "Cannot redeclare constant '%s'", ZSTR_VAL(unqualified_name));

Zend/zend_inheritance.c
1038:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s%s::$%s as %s%s::$%s",

ext/opcache/zend_accelerator_util_funcs.c
468:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
473:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));
512:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
517:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));

That gives me a much better hint, Zend/zend_compile.c and ext/opcache/zend_accelerator_util_funcs.c can be the cause of this, let’s see how them work starting by Zend/zend_compile.c.

I got that function:

static zend_never_inline ZEND_COLD ZEND_NORETURN void do_bind_function_error(zend_string *lcname, zend_op_array *op_array, zend_bool compile_time) /* {{{ */
{
    zval *zv = zend_hash_find_ex(compile_time ? CG(function_table) : EG(function_table), lcname, 1);
    int error_level = compile_time ? E_COMPILE_ERROR : E_ERROR;
    zend_function *old_function;

    ZEND_ASSERT(zv != NULL);
    old_function = (zend_function*)Z_PTR_P(zv);
    if (old_function->type == ZEND_USER_FUNCTION
        && old_function->op_array.last > 0) {
        zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
                    op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name),
                    ZSTR_VAL(old_function->op_array.filename),
                    old_function->op_array.opcodes[0].lineno);
    } else {
        zend_error_noreturn(error_level, "Cannot redeclare %s()",
            op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name));
    }
}

This function is likely called when PHP reachs an state where function redeclaration is done, but the purpose is only logging that and fail, I will delete this method to figure out what breaks in compilation, I am likely not wanting this function anymore anyway.

Shamelessly I run make -j4 knowing this is not going to compile anymore.

/home/sergio/php-8.0.3/Zend/zend_compile.c: In function ‘do_bind_function’:
/home/sergio/php-8.0.3/Zend/zend_compile.c:1063:3: warning: implicit declaration of function ‘do_bind_function_error’; did you mean ‘do_bind_function’? [-Wimplicit-function-declaration]
 1063 |   do_bind_function_error(Z_STR_P(lcname), NULL, 0);
      |   ^~~~~~~~~~~~~~~~~~~~~~
      |   do_bind_function

Ok, let’s see what is happening here.

I got into the do_bind_function it looks promising:

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

This is not Kansas anymore, a good bunch of strange macros and functions that are not familiar to me are here, like UNEXPECTED or EG, but PHP developer were enough kind to make good variable names and functions so I maybe be able to tweak a little the code.

Maybe silencing the error is enough…

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{    
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

mkdir b && make -j4 && INSTALL_ROOT=b make install

But there are more calls to do_bind_error:

    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

This gives me bad feelings, I don’t think this == NULL means the key is updated anyway, maybe I supposed too much, let’s look what it does, I may have to go back and rethink all the previous changes.

static zend_always_inline void *zend_hash_add_ptr(HashTable *ht, zend_string *key, void *pData)
{
    zval tmp, *zv;

    ZVAL_PTR(&tmp, pData);
    zv = zend_hash_add(ht, key, &tmp);
    if (zv) {
        ZEND_ASSUME(Z_PTR_P(zv));
        return Z_PTR_P(zv);
    } else {
        return NULL;
    }
}

Let’s look into zend_hash_add…

Ups I made too much assumptions…

ZEND_API zval* ZEND_FASTCALL zend_hash_add(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update_ind(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE | HASH_UPDATE_INDIRECT);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_add_new(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD_NEW);
}

If there is a update it is clear that add won’t update, I also found a interesting function:

ZEND_API zval* ZEND_FASTCALL zend_hash_add_or_update(HashTable *ht, zend_string *key, zval *pData, uint32_t flag)
{
    if (flag == HASH_ADD) {
        return zend_hash_add(ht, key, pData);
    } else if (flag == HASH_ADD_NEW) {
        return zend_hash_add_new(ht, key, pData);
    } else if (flag == HASH_UPDATE) {
        return zend_hash_update(ht, key, pData);
    } else {
        ZEND_ASSERT(flag == (HASH_UPDATE|HASH_UPDATE_INDIRECT));
        return zend_hash_update_ind(ht, key, pData);
    }
}

Unfortunatelly this means the bitmask the flag contains. (I looked at it without you cannot be used to tell the hashtable to do insert or update whatever it needs, let’s look in the private method they all call to see if it is true what I am thinking.

static zend_always_inline zval *_zend_hash_str_add_or_update_i(HashTable *ht, const char *str, size_t len, zend_ulong h, zval *pData, uint32_t flag)
{
    zend_string *key;
    uint32_t nIndex;
    uint32_t idx;
    Bucket *p;

    IS_CONSISTENT(ht);
    HT_ASSERT_RC1(ht);

    if (UNEXPECTED(HT_FLAGS(ht) & (HASH_FLAG_UNINITIALIZED|HASH_FLAG_PACKED))) {
        if (EXPECTED(HT_FLAGS(ht) & HASH_FLAG_UNINITIALIZED)) {
            zend_hash_real_init_mixed(ht);
            goto add_to_hash;
        } else {
            zend_hash_packed_to_hash(ht);
        }
    } else if ((flag & HASH_ADD_NEW) == 0) {
        p = zend_hash_str_find_bucket(ht, str, len, h);

        if (p) {
            zval *data;

            if (flag & HASH_ADD) {
                if (!(flag & HASH_UPDATE_INDIRECT)) {
                    return NULL;
                }
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if (Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                    if (Z_TYPE_P(data) != IS_UNDEF) {
                        return NULL;
                    }
                } else {
                    return NULL;
                }
            } else {
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if ((flag & HASH_UPDATE_INDIRECT) && Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                }
            }
            if (ht->pDestructor) {
                ht->pDestructor(data);
            }
            ZVAL_COPY_VALUE(data, pData);
            return data;
        }
    }

    ZEND_HASH_IF_FULL_DO_RESIZE(ht);        /* If the Hash table is full, resize it */

add_to_hash:
    idx = ht->nNumUsed++;
    ht->nNumOfElements++;
    p = ht->arData + idx;
    p->key = key = zend_string_init(str, len, GC_FLAGS(ht) & IS_ARRAY_PERSISTENT);
    p->h = ZSTR_H(key) = h;
    HT_FLAGS(ht) &= ~HASH_FLAG_STATIC_KEYS;
    ZVAL_COPY_VALUE(&p->val, pData);
    nIndex = h | ht->nTableMask;
    Z_NEXT(p->val) = HT_HASH(ht, nIndex);
    HT_HASH(ht, nIndex) = HT_IDX_TO_HASH(idx);

    return &p->val;
}

This means that if I can make this function get HASH_ADD | HASH_UPDATE_INDIRECT I may be able to update the hash table, let’s continue later, this was rough, but I am starting catching concepts.

Categories
Perl XS

More refactor to our XS module. Bringing the internals to a private C file.

In this article the code I did in the last article to allow the usage of the arguments list as a Hash is going to be moved to a C file to provide a way to reuse this code in all my subroutines of this module without exposing this API to Perl. (In Perl this job is already done.)

Let’s see how, first we are going to make the following directories into mega_openssl_helper_xs src and src/include and we are going to add to the Makefile.PL of mega_openssl_helper_xs an OBJECT and postamble directive so it gets the code into that folders.

use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'Peertube::DL::Mega::Helper',
    VERSION => '0.1',
    XS      => { 'mega.xs' => 'mega.o' },
    INC     => '-Isrc/include',
    OBJECT  => 'src/private.o mega.o',
    LDFLAGS => '-Wl-t',
    DIR     => ['src'],
);  
        
package MY {
            
    sub postamble {
        return . "src: src/Makefile\n" . "\tcd src && $(MAKE) $(PASSTHRU)\n";
    }       
} 

Now we are making a mega_openssl_helper_xs/src/Makefile.PL to compile the C that I am going to put into that directory:

use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'Peertube::DL::Mega::Helper::SRC',
    VERSION => '0.1',
    INC     => '-I./include',
    C       => [ 'private.c', ],
    OBJECT  => '${O_FILES}',
    LDFLAGS => '-Wl-t',
);

We make the src/include/private.h file to declare the subroutines for code reuse:

#include "EXTERN.h"
#include "perl.h"

HV * hash_from_list(SV **, size_t list_len);

And we move the code handling the hashes to this subroutine into src/private.c

#include "EXTERN.h"
#include "perl.h"

HV * hash_from_list(SV **list, size_t list_len) {
    HV * hash = newHV();
    bool is_key = true;
    char *key;
    STRLEN key_len;
    for ( int i = 0; i < list_len; i++ ) {
        if (is_key) {
            key = SvPV(list[i], key_len);
        } else {
            SvREFCNT_inc(list[i]);
            if ( !hv_store(hash, key, key_len, list[i], 0) ) {
                warn("Failed to write into hash.");
                SvREFCNT_dec(list[i]);
            }
        }
        is_key = !is_key;
    }
    if ( !is_key ) {
        warn("Even number of parameters in hash argument list.");
        SV *undef = sv_newmortal();
        SvREFCNT_inc(undef);
        if ( !hv_store(hash, key, key_len, sv_newmortal(), 0) ) {
            warn("Failed to write into hash.");
            SvREFCNT_dec(undef);
        }
    } 
    return hash;
}

Now our mega.xs print XSUB is more compact:

void
print(self, ...)
    Peertube_DL_Mega_Helper self 
    CODE:
        if ( items <= 1 ) {
            croak("Less parameters than expected.");
        }
        size_t list_len = items - 1;
        SV **list = malloc(sizeof (SV *) * list_len);
        for ( int i = 1; i < items; i++ ) {
            list[i-1] = ST(i);
        }

        HV *hash = hash_from_list(list, list_len);        
        free(list);
        char *key = "hello";
        SV **hello = hv_fetch(hash, key, strlen(key), 0);
        if ( hello != NULL && *hello != NULL ) {
            if ( !SvOK(*hello) ) {
                warn("hello is undef.");
            }
            printf("hello: %s\n", SvPV_nolen(*hello));
        } else {
            croak("Parameter hello required.");
        }

I hope you have enjoy this article about XS.

Erratas: sizeof was unfortunatelly not enought to get the size of the array, so I did it passing the size manually to the subroutine which sightly complicates the code.

Erratas II: I left some printf statements not needed.

Erratas III: I left a stdio.h import in the C file which was no longer needed, I used it with debugging purposes.

Categories
Perl XS

Playing more with XS: Retrieving subroutine arguments as a hash.

It’s common to use the argument list as a hash in Perl to provide a easier way for the Perl developers interface with subroutines, in this article I will explain how to do the same in XS. Note: not as easy as in Perl.

First we will change the signature like this to take variadic arguments:

void
print(self, ...)
    Peertube_DL_Mega_Helper self
CODE:

As you may now self is the caller object in Perl, not too much trouble, and … is saying basically “Pass me whatever and I will handle it.”, Peertube_DL_Mega_Helper is a custom typemap made by me in the last XS article.

Now we are going to check if users passed us more than a value, for example:

if ( item <= 1 ) {
    croak("Less parameters than expected.");
}

This is a little agresive way to handle this suppose, but this is a demo, feel free to take whatever behaviour you feel more appropiate for your case use.

Now we are going to define a hash to hold our arguments:

HV *hash = newHV();

And some variables useful in the argument processing loop:

bool is_key = true;
char *key;
STRLEN key_len;

And we define the loop to introduce those arguments into a our hash:

for ( int i = 1; i < items; i++ ) {
    if (is_key) {
        key = SvPV(ST(i), key_len);
    } else {    
        SvREFCNT_inc(ST(i));
        if ( !hv_store(hash, key, key_len, ST(i), 0) ) {
            SvREFCNT_dec(ST(i));
        }
    }
    is_key = !is_key;
}

Unfortunatelly this loop misses warning the developer if a even number of elements is passed with default to the last element without value to be undef, we are going to implement also that:

if ( !is_key ) { 
    warn("Even number of parameters in hash argument list.");
    SV *undef = sv_newmortal();
    SvREFCNT_inc(undef);
    if ( !hv_store(hash, key, key_len, sv_newmortal(), 0) ) {
        SvREFCNT_dec(undef);
    }
}

Croak is also an option, but this introduces how to add undefs and I thought would be educative to handle that case as Perl does™

Now we are going to use what we just did to search for the named parameter “hello”, let’s see how:

// Reusing the variable since the name is really convenient and descriptive.
key = "hello"; 
SV **hello = hv_fetch(hash, key, strlen(key), 0);
if ( hello != NULL && *hello != NULL ) {
    if ( !SvOK(*hello) ) {
        warn("hello is undef.");
    }
    printf("hello: %s\n", SvPV_nolen(*hello));
} else {
    croak("Parameter hello required.");
}

After compiling with cpanm . -v to catch errors we are going to run the usual oneliners to check the capabilities we added, I may do an article of testing in Perl later, now I am just learning XS.

sergio@tiki ~/Peertube-dl $ perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print("hello" => "world")';
hello: world
sergio@tiki ~/Peertube-dl $ perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print';
Less parameters than expected. at -e line 1.
sergio@tiki ~/Peertube-dl $ perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print("hello" => "world", "hello" => )';
Even number of parameters in hash argument list. at -e line 1.
hello is undef. at -e line 1.
hello: 

I hope you enjoyed the article.

Categories
Perl

Why Perl is my favourite programming language in 2021?

Perl started it’s development in 1987 before the born of the Linux kernel and gained popularity as a web development language for CGI scripting during the 90s since then many developers have categorized the language as a write only language, but I think this opinions are deprecated with the today features of modern Perl plus the expressiveness Perl have been given to it’s developers since it’s starting.

Perl has three important types of variables, each of one features a sigil which must be before any representation of that variable.

$ features a Scalar variable, a variable which can hold only a element.

@ features an Array variable, a variable that holds a list of elements.

% features a Hash variable, a variable that holds a map of String to Scalar without order.

This differentiation plus the usage of “use strict;’ makes the development of Perl easier since many complex data type errors can be caught on compile time.

Perl is a dynamic typed programming language as Python or Javascript, so you can assign a Scalar variable any value like a String, a Number, Undef (Perl has no null.) or a Reference to a more complex type of variable.

By default Perl interprets that if you are passing a Array or Hash to a Subroutine what you want is passing the contents of this structure as the parameters of the Subroutine which makes some constructs longer in some programming languages to be written without hastle.

Let’s put it in a example:

use strict;
use warnings;

use feature 'say';

sub ExampleSubroutine {
    my ($a, $b, $c) = @_;
    say $a;
    say $b;
    say $c;
}

my @params = ('a', 'b', 'c');
ExampleSubroutine(@params);

The lack of need to specify signatures to subroutines allows a lot of expresiveness also, althought you may use libraries like https://metacpan.org/pod/Params::ValidationCompiler in more complex programs to get a self-documenting interface to your methods and subroutines.

Testing in Perl is also great thanks to great resources from cpan like https://metacpan.org/pod/Test::MockObject, https://metacpan.org/pod/Test::MockModule and https://metacpan.org/pod/Test::Most and the flexibity of Perl on mocking subroutines that allows to test both procedural and OOP in a really easy way no matter there was not test before unlike PHP where mocking it’s limited to object methods.

Tons of the Perl cpan modules are packaged in GNU/Linux distributions so if you want to package your software to work in a GNU/Linux distribution it is not a so difficult task.

Perl also grants its developers a great UTF-8 optionally aware regex engine which helps to parse complex data in a easy manner and it’s the most common way to manipulate Strings in Perl.

The Perl object’s are just blessed references which allows your object to be a Hash (The most common since allows the common key/value internal scheme of an Object.), an Array, an Subroutine (To force encapsulation by having the real attributes in a more restricted scope.)

Perl anonymous subroutines inherit the scope allowing to do some functional programming tricks.

There are great libraries to work with files in Perl like https://metacpan.org/pod/Path::Tiny which allows a near natural language way to manipulate files without having to be so aware about file handles or directory handles as the default. (The default sometimes also plays well in scripts.)

The Perl command line debugger is simply great, allowing to discover bugs in a easy and reproducible way so you can share debugging sessions as plain text with other programmers.

The default variable $_ allows the programmer to not to be having to name a variable that is used in a very limited scope before stopping being useful.

The for syntax is really cool $object->doSomething for my $object (@objects) or it’s anonymous counterpart $_->doSomething for (@objects)

It’s not always needed to put parents on subroutine calls allowing the developer to type less when the scope is clear.

It is also great the “There is more than a way to do it” philosophy behind the language, allowing you to choose the best way to do for your problem.

When we talk about optimization we should first know for what we want to optimize like Speed, CPU, Memory Usage or code Legibility it’s not uncommon you have to trade one to have the other, for example a complex and extensive data structure can be stored in disk to save memory but accesing it will harm Speed, CPU usage and Legibility by increasing the latency because disk reads, increasing the CPU usage because the aditional cycles to retrieve the data when needed and Legibility by doing the program more complex.

It’s is great a language takes different problems have different needs and adapt itself to the multiple ways to tackle a problem a developer may have.

Categories
Perl XS

Perl niceness, building a XS module that acts like a Perl object.

I have been trying the last days to add Mega.nz support to Peertube-dl, it is not an easy task since Mega.nz implements encryption with AES in the files so the server owner is not able in theory to read the users files. (My opinion is anyway the server admins would be able to get access to those files by simply parsing the server logs.)

I have not success yet, because the lack of good Perl libraries and the bad Perl is for parsing binary data and then I thought, why not try to use XS to solve this problem as I did before with the Javascript interpreter need for Youtube?

So this I am trying, and in the process I did a little improvement in my way to write XS thanks to the learn about how to use typemaps to build more ideomatic for both Perl and C XS modules.

Let’s start the house by the roof, would not be great being able to write a XS module like this:

#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdio.h>

typedef struct helper {
    int value;
} * Peertube_DL_Mega_Helper;

MODULE = Peertube::DL::Mega::Helper  PACKAGE = Peertube::DL::Mega::Helper
PROTOTYPES: DISABLE

Peertube_DL_Mega_Helper
new(class, value)
    char *class
    int value
    CODE:
        RETVAL = malloc(sizeof (struct helper));
        RETVAL->value = value;
    OUTPUT:
        RETVAL

void
print(self)
    Peertube_DL_Mega_Helper self
    CODE:
        printf("%d\n", self->value);

void
DESTROY(self)
    Peertube_DL_Mega_Helper self
    CODE:
        free(self);

This code can be achieved with the help of the file typemap, with the help of https://perldoc.perl.org/perlxstypemap I discovered that, and the following typemap will look familiar to anybody which have read that document.

TYPEMAP
    Peertube_DL_Mega_Helper T_PTROBJ_SPECIAL

INPUT
T_PTROBJ_SPECIAL
    if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
        IV tmp = SvIV((SV*)SvRV($arg));
        $var = INT2PTR($type, tmp);
    } else {
        croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\");
    }

OUTPUT
T_PTROBJ_SPECIAL
    sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
           (void*)$var);

Of course some more additions will be needed to make this code run, like the clasical xsloader Perl module and some lines in the Makefile.PL which is out of scope in this document.

We will be able to run this code like this and check it works:

perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print';

It should print 3.

It is true XS is scaring because strange subroutines and magical variable names and tons of rare sections difficult to understand on the first read of the docs, but this kind of sugar makes my day, it is a shame there aren’t much XS articles and Stackoverflow responses because makes the language slower to learn.