Categories
Mojolicious Perl Test::Most

BeastBB, my new project to build a Bulletin Board like webpage using Mojolicious.

https://gitea.sergiotarxz.freemyip.com/sergiotarxz/BeastBB

Just to improve with Mojolicious as Perl web developer I started a few weeks ago a project called BeastBB which I want to be just another Bulletin Board webpage.

In this project I am learning a lot about Mojo::Pg, for example I learned how to test it using DBD::Mock, and also learned to use SQL::Abstract::Pg which makes querying simple things or inserts/updates less overhead.

Also in the development of this project I found how to avoid a nasty error which required my Makefile.PL to run the install target two times before the templates being correctly installed.

I also achieved .config/beastbb/ config file installation which is great because mades the installation project dir agnostic, you can run cpanm . –installdeps && cpanm . -v and then delete the project dir and the webpage will work anyway. The database migrations files also get installed automatically and if new migrations are found after an upgrade they will be run without the need to do it manually.

I created an object called BeastBB::Response which is used to control controllable errors, you typically would use it like this:

my $response = FunctionThatMayFail();
return $self->reply->exception('Function failed because the data is wrong with message ' . $response->ErrorMessage)->rendered(400) if $response->IsError;
my $data = $response->Content;
doThings($data);

The BeastBB::Response code is something like this:

package BeastBB::Response;

use 5.32.1;

use strict;
use warnings;

use Carp qw/confess cluck/;

use Params::ValidationCompiler 'validation_for';
use Types::Standard qw/Bool Str Any/;

{
    my $validator = validation_for(
        params => {
            is_error      => { type => Bool, default  => 0 },
            content       => { type => Any,  optional => 1 },
            error_message => { type => Str,  optional => 1 },
        }
    );

    sub new {
        my $class    = shift;
        my %params   = $validator->(@_);
        my $is_error = $params{is_error};
        my $content;
        my $error_message;

        if ( exists $params{content} ) {
            $content = $params{content};
        }
        if ( exists $params{error_message} ) {
            $error_message = $params{error_message};
        }

        if ($is_error) {
            cluck 'Error should not have content, stripping it.'
              if defined $content;
            cluck 'You should pass a error message on error.'
              if !defined $error_message;
            return bless {
                is_error      => 1,
                error_message => $error_message // '',
            }, $class;
        }
        return bless { content => $content }, $class;
    }
}

sub IsError {
    my $self = shift;
    return $self->{is_error};
}

sub ErrorMessage {
    my $self = shift;
    if ( !$self->IsError ) {
        confess 'This is not an error.';
    }
    return $self->{error_message};
}

sub Content {
    my $self = shift;
    if ( $self->IsError ) {
        confess 'Attempt to get content from error.';
    }
    return $self->{content};
}
1;

I trying to test most I do, but the 100% coverage is still a challenge because testing controllers is hard and sometimes I am to lazy to test too trivial things:

I didn’t achieve yet to split the Mock objects from the real code, I suppose more Mock objects will be needed to avoid to do large objects constructions in most tests like the ones needed to build a User which is a object which takes many parameters.

Params::ValidationCompiler is making my live easier than it was with Params::Validate allowing me to define my own types like matrix_address or asking for a concrete class.

I made a class with a few utility types to reuse them everywhere called BeastBB::Types:

package BeastBB::Types;

use 5.30.3;

use strict;
use warnings;

use Exporter qw/import/;
use Scalar::Util qw/blessed/;
use Type::Tiny;

use Const::Fast;

our @EXPORT_OK = ( '&IsClassTypeGenerator', '$MATRIX_ADDRESS_REGEX', '$MATRIX_ADDRESS_TYPE' );

const our $MATRIX_ADDRESS_REGEX => qr/^@\w+:(\w|\.)+\.(\w+)$/;


const our $MATRIX_ADDRESS_TYPE => Type::Tiny->new(
    name => "MatrixAddressChecker",
    constraint => sub {
        my $matrix_address = shift;
        return 1 if $matrix_address =~ /$MATRIX_ADDRESS_REGEX/;
    }
);

my %generated_classes;

sub IsClassTypeGenerator {
    my $class = shift;
    if ( !exists $generated_classes{$class} ) {
        my $sanitized_class = $class =~ s/:://gr;
        $generated_classes{$class} = Type::Tiny->new(
            name => "Is$sanitized_class",
            constraint => sub {
                my $item_to_test = shift;
                return 1 if blessed $item_to_test && $item_to_test->isa($class);
                return 0;
            },
        );
    }
    return $generated_classes{$class};
}
1;

It’s being really fun to build BeastBB with Perl thanks to all the libraries I can use, in my cpanfile you can find all those:

requires 'Mojolicious';
requires 'Mojo::Pg';
requires 'ExtUtils::MakeMaker';
requires 'Crypt::URandom';
requires 'DBD::Pg';
requires 'DBD::Mock';
requires 'Const::Fast';
requires 'Params::ValidationCompiler';
requires 'Types::Standard';
requires 'Crypt::Bcrypt::Easy';
requires 'DateTime';
requires 'DateTime::Format::Pg';
requires 'Test::Most';
requires 'Test::MockModule';
requires 'Test::Warnings';

They may not seem to be too much, but they give me the capabilities of their dependencies like does Mojo::Pg with SQL::Abstract::Pg.

Using Perl to build a web is a fun exercise with Mojolicious.

Categories
notest Perl PHP

Creating reproducible tests in pull requests so reviewers can know the code is properly working when your codebase lacks unit tests.

Everyone who has worked before with unit testing will know that pull request based testing is highly inefficient on the sense tests are going to be lost once the pull request is approved. But meanwhile automated testing is implemented in a legacy project code has still to be done to implement new features or bugfixes.

That is where the guideline I am going to expose has it’s niche, trying to get the job done while preserving a sane behaviour in the code.

This guide assumes a web based project, some parts may have to be adapted on discrection for other use cases.

General Guidelines.

  • Avoid high level instructions that can be written as commands.

It is usually the best to avoid saying the reviewers to do something that they may have to investigate when you can write some fancy command that does it. Example:

Bad example:

“Block the black haired users from having a avatar.”

Good example:

“Execute the following sql query to block the black haired users from having a avatar.”

update set avatar_blocked=1 from users where hair = 'black';

Other bad example:

Remove lines from 500-550 from lib/Users/BlockAvatar.php since they attempt to connect to a external ftp and are going to generate an error and set $got_external_csv to 0.

Good example:

Remove lines from 500-550 from lib/Users/BlockAvatar.php since they attempt to connect to a external ftp and are going to generate an error with this command and set $got_external_csv to 0:

perl <( cat << 'EOF'
use 5.30.0;
my $i = 0;
while (<>) {
    $i++; 
    next if $i >= 500 && $i <= 550;
    say << 'END_OF_SAY' if $i == 551;
        # Temporal fix to avoid ftp connections in testing.
        $got_external_csv = 0;
END_OF_SAY
    print;
}
EOF
) lib/Users/BlockAvatar.php > block_avatar_tmp.php
cp block_avatar_tmp.php lib/Users/BlockAvatar.php
rm block_avatar_tmp.php
  • Avoid to write a database query as a bash command and write the query directly so everyone can use whatever database client they find more comfortable with.

Bad example:

echo 'select * from users' | mysql

Good example:

select * from users;
  • Avoid to write operations in the webpage that users are supposed to do as bash commands and instead send the reviewers to do those operations unless it is really needed.

This is mainly because two reasons, if you bindly copy as Posix Curl a Firefox request you have chances to collide against csfr tokens or leak your authentication cookies, not a good deal, also you have chances that if you broke something in frontend in your changes it gets unnoticed in the pull request.

Bad example:

“Delete the Luffy user”

curl -X DELETE www.myweb.com/api/user/luffy

Good example:

Go to https://www.myweb.com/admin/manage_user?user=luffy and press delete this user.

  • Include screenshots of GUI steps if possible indicating where should be reviewer interact with the webpage and how.

Those screenshots should not be an alternative against text description but a complement to avoid blind reviewer discrimination, it should be a help for people with visual minds, not a diversity killer.

Categories
git GNU/Linux ssh Windows

Dealing with SSH key management in a mixed Windows and GNU/Linux environment with WSL.

I am going to drop a bunch of tips which may be useful to people using SSH in a extensive way for development to authenticate against their remote git user in say Gitlab, Gitea, Github or Bitbucket and happens to not being able to work in the best operative system for development.

The first thing is SSH key generation, you surely would not like to have to update your SSH key in the remote Git server because you accidentally generated a new pair of keys so you should be careful while typing the ssh-keygen, you should do it before submiting your key to the remote server and NEVER again.

Choose carefully the root of your ssh keys in every computer and put yourself the rule of never overriding that SSH key pair to avoid losing accidentally the access to the remote repository.

That said you install your GNU/Linux distribution of choice, generate a key pair and submit the .ssh/id_rsa.pub to the remote repository and then you should be able to work from your WSL user cloning repositories, but then you want to clone the repository in for example a directory owned by www-data.

You may try:

cd /var/www/
sudo git clone ssh://user@host/myfancyrepo

That will give you a beautiful permission denied and you may stick confused because of this, but you should be aware ssh keys are private for every user, so if you think about it, it’s simply logic root has different ssh keys than your user.

A correct approach would be:

sudo mkdir ~root/.ssh
sudo cp ~/.ssh/id_rsa{,.pub} ~root/.ssh
sudo git clone ssh://user@host/myfancyrepo

But correct is not good, a better approach would be to do it with the user owner of the folder like this since when the directory with the repository is created if it needs some sort of installation, say composer or npm you will be tempted to do that installation with root which is absolutely discouraged.

sudo mkdir ~www-data/.ssh
sudo cp ~/.ssh/id_rsa{,.pub} ~www-data/.ssh/
sudo -u www-data git clone ssh://user@host/myfancyrepo

If you are too often having to use git repositories outside your WSL environment in Windows folders, you will find soon how slow it can get, I recommend you to install the Git for Windows and ensure it is in the Windows PATH and then add this lines to your .bashrc:

export WIN_DIR=/mnt/c
export GIT=$(which git)

git_wrapper() {
    if perl -e 'exit int($ARGV[1] !~ /^@{[$ARGV[0]]}(?:\/|$)/)' $WIN_DIR $PWD; then
        git.exe $@ 
    else  
        $GIT $@
    fi      
}

alias git=git_wrapper 

You will need to copy your SSH keys to the Windows user like this:

sudo mkdir -pv /mnt/c/Users/<youruser>/.ssh/
sudo cp ~/.ssh/{id_rsa{,.pub} /mnt/c/Users/<youruser>/.ssh/

This should allow you to painlessly be able to use wsl to work with both WSL repositories and Windows ones.

This post is a compilation of common issues I have been seen and suffered during the development in Windows in my job and I hope it helps somebody else to do not fall in some traps to the newbies leading with SSH key management.

Categories
PHP Zend

Hacking PHP to allow the redeclaration of functions. (II)

This is the next step in my adventure trying to redeclarate functions in PHP, you can see the previous effort in the linked post.

We are going again to the do_bind_function to use what we learned in the last chapter if you look at it is looks strange know since it is finding the zv key when it has still not putting a value in the table.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{   
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

Would be great to solve that mistery before getting forward breaking things. We search for that function:

sergio@bahdder ~/php-8.0.3 $ rg do_bind_function
Zend/zend_compile.h
776:ZEND_API zend_result do_bind_function(zval *lcname);

Zend/zend_vm_def.h
7589:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_vm_execute.h
2821:   do_bind_function(RT_CONSTANT(opline, opline->op1));

Zend/zend_compile.c
1054:ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
1061:        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
1072:        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);

UPGRADING.INTERNALS
281:        - do_bind_function()

And then we are going to look at zend_vm_def.h what it does.

ZEND_VM_HANDLER(141, ZEND_DECLARE_FUNCTION, ANY, ANY)
{   
    USE_OPLINE
    
    SAVE_OPLINE();
    do_bind_function(RT_CONSTANT(opline, opline->op1));
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

Macros everywhere…

Zend/zend_vm_execute.h
402:# define SAVE_OPLINE() EX(opline) = opline

This gives a important hint about how opline is declared, but not why the function is expected to be in the hash table.

Whatever, I am going to do this modification and restore do_bind_function_error.

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{       
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!zv) {
            zv = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}
sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

It is still happening, let’s look elsewhere…

if (toplevel) {
        if (zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL) {
            zend_hash_update_ptr(CG(function_table), lcname, op_array);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }
ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;
    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
        if (!zv) {
            zv = zend_hash_update(EG(function_table), Z_STR_P(lcname), zv);
        }
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

Let’s try now…

Let’s create a file named a.php with this content:

<?php
function a() {
    print "hola";
}

And now we are going to try this:

 b/usr/local/bin/php -r 'function a(){} a(); include "a.php"; a();'
sergio@bahdder ~/php-8.0.3 $ b/usr/local/bin/php -r 'function a(){ echo "adios"; } a(); include "a.php"; a();'
adiosholasergio@bahdder ~/php-8.0.3 $ 

This appears to have worked, if I continue with this I will try to do this with a function named mock in a extension to get into extension development keeping my hands out of core.

Categories
PHP Uncategorized Zend

Hacking PHP to allow redeclaration of functions. (I)

It is often said that a developer of some language is able to write that language in any language. I can tell you this is a great true.

Currently I am in a project which uses the PHP technology and when testing it PHP does nothing but get into my way when trying to write Perl in PHP not allowing me to redeclare procedural functions.

I have seen responses in like this https://stackoverflow.com/questions/1244949/mocking-php-functions-in-unit-tests which recommend the usage of runkit, but I thought it would be could to get into the PHP code and try to implement such a thing.

The PHP code is a pretty large codebase so searching for this concrete restriction would be hard, but I have a hint given by the own PHP, let’s run this:

sergio@bahdder ~/php-8.0.3 $ php -r 'function a(){} function a(){ print "hello";} a();'
PHP Fatal error:  Cannot redeclare a() (previously declared in Command line code:1) in Command line code on line 1

Let’s search this message and try to get PHP do what I mean just for learning how the function declaration works in the PHP source code.

rg 'Cannot redeclare'

Ups, too much output… Let’s try again stripping some directories from the rg used in test which right now are not interesting.

rg 'Cannot redeclare' --glob '!Zend/tests' --glob '!tests'

That’s is better:

Zend/zend_compile.c
1065:           zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
1070:           zend_error_noreturn(error_level, "Cannot redeclare %s()",
6499:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
6817:           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::%s()",
7064:                   zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s::$%s",
7658:                           "Cannot redeclare constant '%s'", ZSTR_VAL(unqualified_name));

Zend/zend_inheritance.c
1038:                           zend_error_noreturn(E_COMPILE_ERROR, "Cannot redeclare %s%s::$%s as %s%s::$%s",

ext/opcache/zend_accelerator_util_funcs.c
468:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
473:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));
512:            zend_error(E_ERROR, "Cannot redeclare %s() (previously declared in %s:%d)",
517:            zend_error(E_ERROR, "Cannot redeclare %s()", ZSTR_VAL(function1->common.function_name));

That gives me a much better hint, Zend/zend_compile.c and ext/opcache/zend_accelerator_util_funcs.c can be the cause of this, let’s see how them work starting by Zend/zend_compile.c.

I got that function:

static zend_never_inline ZEND_COLD ZEND_NORETURN void do_bind_function_error(zend_string *lcname, zend_op_array *op_array, zend_bool compile_time) /* {{{ */
{
    zval *zv = zend_hash_find_ex(compile_time ? CG(function_table) : EG(function_table), lcname, 1);
    int error_level = compile_time ? E_COMPILE_ERROR : E_ERROR;
    zend_function *old_function;

    ZEND_ASSERT(zv != NULL);
    old_function = (zend_function*)Z_PTR_P(zv);
    if (old_function->type == ZEND_USER_FUNCTION
        && old_function->op_array.last > 0) {
        zend_error_noreturn(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
                    op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name),
                    ZSTR_VAL(old_function->op_array.filename),
                    old_function->op_array.opcodes[0].lineno);
    } else {
        zend_error_noreturn(error_level, "Cannot redeclare %s()",
            op_array ? ZSTR_VAL(op_array->function_name) : ZSTR_VAL(old_function->common.function_name));
    }
}

This function is likely called when PHP reachs an state where function redeclaration is done, but the purpose is only logging that and fail, I will delete this method to figure out what breaks in compilation, I am likely not wanting this function anymore anyway.

Shamelessly I run make -j4 knowing this is not going to compile anymore.

/home/sergio/php-8.0.3/Zend/zend_compile.c: In function ‘do_bind_function’:
/home/sergio/php-8.0.3/Zend/zend_compile.c:1063:3: warning: implicit declaration of function ‘do_bind_function_error’; did you mean ‘do_bind_function’? [-Wimplicit-function-declaration]
 1063 |   do_bind_function_error(Z_STR_P(lcname), NULL, 0);
      |   ^~~~~~~~~~~~~~~~~~~~~~
      |   do_bind_function

Ok, let’s see what is happening here.

I got into the do_bind_function it looks promising:

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), NULL, 0);
        return FAILURE;
    }
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    if (UNEXPECTED(!zv)) {
        do_bind_function_error(Z_STR_P(lcname), &function->op_array, 0);
        return FAILURE;
    }
    return SUCCESS;
}

This is not Kansas anymore, a good bunch of strange macros and functions that are not familiar to me are here, like UNEXPECTED or EG, but PHP developer were enough kind to make good variable names and functions so I maybe be able to tweak a little the code.

Maybe silencing the error is enough…

ZEND_API zend_result do_bind_function(zval *lcname) /* {{{ */
{    
    zend_function *function;
    zval *rtd_key, *zv;

    rtd_key = lcname + 1;
    zv = zend_hash_find_ex(EG(function_table), Z_STR_P(rtd_key), 1);
    function = (zend_function*)Z_PTR_P(zv);
    if (UNEXPECTED(function->common.fn_flags & ZEND_ACC_PRELOADED)
            && !(CG(compiler_options) & ZEND_COMPILE_PRELOAD)) {
        zv = zend_hash_add(EG(function_table), Z_STR_P(lcname), zv);
    } else {
        zv = zend_hash_set_bucket_key(EG(function_table), (Bucket*)zv, Z_STR_P(lcname));
    }
    return SUCCESS;
}

mkdir b && make -j4 && INSTALL_ROOT=b make install

But there are more calls to do_bind_error:

    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

This gives me bad feelings, I don’t think this == NULL means the key is updated anyway, maybe I supposed too much, let’s look what it does, I may have to go back and rethink all the previous changes.

static zend_always_inline void *zend_hash_add_ptr(HashTable *ht, zend_string *key, void *pData)
{
    zval tmp, *zv;

    ZVAL_PTR(&tmp, pData);
    zv = zend_hash_add(ht, key, &tmp);
    if (zv) {
        ZEND_ASSUME(Z_PTR_P(zv));
        return Z_PTR_P(zv);
    } else {
        return NULL;
    }
}

Let’s look into zend_hash_add…

Ups I made too much assumptions…

ZEND_API zval* ZEND_FASTCALL zend_hash_add(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_update_ind(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_UPDATE | HASH_UPDATE_INDIRECT);
}

ZEND_API zval* ZEND_FASTCALL zend_hash_add_new(HashTable *ht, zend_string *key, zval *pData)
{
    return _zend_hash_add_or_update_i(ht, key, pData, HASH_ADD_NEW);
}

If there is a update it is clear that add won’t update, I also found a interesting function:

ZEND_API zval* ZEND_FASTCALL zend_hash_add_or_update(HashTable *ht, zend_string *key, zval *pData, uint32_t flag)
{
    if (flag == HASH_ADD) {
        return zend_hash_add(ht, key, pData);
    } else if (flag == HASH_ADD_NEW) {
        return zend_hash_add_new(ht, key, pData);
    } else if (flag == HASH_UPDATE) {
        return zend_hash_update(ht, key, pData);
    } else {
        ZEND_ASSERT(flag == (HASH_UPDATE|HASH_UPDATE_INDIRECT));
        return zend_hash_update_ind(ht, key, pData);
    }
}

Unfortunatelly this means the bitmask the flag contains. (I looked at it without you cannot be used to tell the hashtable to do insert or update whatever it needs, let’s look in the private method they all call to see if it is true what I am thinking.

static zend_always_inline zval *_zend_hash_str_add_or_update_i(HashTable *ht, const char *str, size_t len, zend_ulong h, zval *pData, uint32_t flag)
{
    zend_string *key;
    uint32_t nIndex;
    uint32_t idx;
    Bucket *p;

    IS_CONSISTENT(ht);
    HT_ASSERT_RC1(ht);

    if (UNEXPECTED(HT_FLAGS(ht) & (HASH_FLAG_UNINITIALIZED|HASH_FLAG_PACKED))) {
        if (EXPECTED(HT_FLAGS(ht) & HASH_FLAG_UNINITIALIZED)) {
            zend_hash_real_init_mixed(ht);
            goto add_to_hash;
        } else {
            zend_hash_packed_to_hash(ht);
        }
    } else if ((flag & HASH_ADD_NEW) == 0) {
        p = zend_hash_str_find_bucket(ht, str, len, h);

        if (p) {
            zval *data;

            if (flag & HASH_ADD) {
                if (!(flag & HASH_UPDATE_INDIRECT)) {
                    return NULL;
                }
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if (Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                    if (Z_TYPE_P(data) != IS_UNDEF) {
                        return NULL;
                    }
                } else {
                    return NULL;
                }
            } else {
                ZEND_ASSERT(&p->val != pData);
                data = &p->val;
                if ((flag & HASH_UPDATE_INDIRECT) && Z_TYPE_P(data) == IS_INDIRECT) {
                    data = Z_INDIRECT_P(data);
                }
            }
            if (ht->pDestructor) {
                ht->pDestructor(data);
            }
            ZVAL_COPY_VALUE(data, pData);
            return data;
        }
    }

    ZEND_HASH_IF_FULL_DO_RESIZE(ht);        /* If the Hash table is full, resize it */

add_to_hash:
    idx = ht->nNumUsed++;
    ht->nNumOfElements++;
    p = ht->arData + idx;
    p->key = key = zend_string_init(str, len, GC_FLAGS(ht) & IS_ARRAY_PERSISTENT);
    p->h = ZSTR_H(key) = h;
    HT_FLAGS(ht) &= ~HASH_FLAG_STATIC_KEYS;
    ZVAL_COPY_VALUE(&p->val, pData);
    nIndex = h | ht->nTableMask;
    Z_NEXT(p->val) = HT_HASH(ht, nIndex);
    HT_HASH(ht, nIndex) = HT_IDX_TO_HASH(idx);

    return &p->val;
}

This means that if I can make this function get HASH_ADD | HASH_UPDATE_INDIRECT I may be able to update the hash table, let’s continue later, this was rough, but I am starting catching concepts.

Categories
Perl XS

More refactor to our XS module. Bringing the internals to a private C file.

In this article the code I did in the last article to allow the usage of the arguments list as a Hash is going to be moved to a C file to provide a way to reuse this code in all my subroutines of this module without exposing this API to Perl. (In Perl this job is already done.)

Let’s see how, first we are going to make the following directories into mega_openssl_helper_xs src and src/include and we are going to add to the Makefile.PL of mega_openssl_helper_xs an OBJECT and postamble directive so it gets the code into that folders.

use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'Peertube::DL::Mega::Helper',
    VERSION => '0.1',
    XS      => { 'mega.xs' => 'mega.o' },
    INC     => '-Isrc/include',
    OBJECT  => 'src/private.o mega.o',
    LDFLAGS => '-Wl-t',
    DIR     => ['src'],
);  
        
package MY {
            
    sub postamble {
        return . "src: src/Makefile\n" . "\tcd src && $(MAKE) $(PASSTHRU)\n";
    }       
} 

Now we are making a mega_openssl_helper_xs/src/Makefile.PL to compile the C that I am going to put into that directory:

use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'Peertube::DL::Mega::Helper::SRC',
    VERSION => '0.1',
    INC     => '-I./include',
    C       => [ 'private.c', ],
    OBJECT  => '${O_FILES}',
    LDFLAGS => '-Wl-t',
);

We make the src/include/private.h file to declare the subroutines for code reuse:

#include "EXTERN.h"
#include "perl.h"

HV * hash_from_list(SV **, size_t list_len);

And we move the code handling the hashes to this subroutine into src/private.c

#include "EXTERN.h"
#include "perl.h"

HV * hash_from_list(SV **list, size_t list_len) {
    HV * hash = newHV();
    bool is_key = true;
    char *key;
    STRLEN key_len;
    for ( int i = 0; i < list_len; i++ ) {
        if (is_key) {
            key = SvPV(list[i], key_len);
        } else {
            SvREFCNT_inc(list[i]);
            if ( !hv_store(hash, key, key_len, list[i], 0) ) {
                warn("Failed to write into hash.");
                SvREFCNT_dec(list[i]);
            }
        }
        is_key = !is_key;
    }
    if ( !is_key ) {
        warn("Even number of parameters in hash argument list.");
        SV *undef = sv_newmortal();
        SvREFCNT_inc(undef);
        if ( !hv_store(hash, key, key_len, sv_newmortal(), 0) ) {
            warn("Failed to write into hash.");
            SvREFCNT_dec(undef);
        }
    } 
    return hash;
}

Now our mega.xs print XSUB is more compact:

void
print(self, ...)
    Peertube_DL_Mega_Helper self 
    CODE:
        if ( items <= 1 ) {
            croak("Less parameters than expected.");
        }
        size_t list_len = items - 1;
        SV **list = malloc(sizeof (SV *) * list_len);
        for ( int i = 1; i < items; i++ ) {
            list[i-1] = ST(i);
        }

        HV *hash = hash_from_list(list, list_len);        
        free(list);
        char *key = "hello";
        SV **hello = hv_fetch(hash, key, strlen(key), 0);
        if ( hello != NULL && *hello != NULL ) {
            if ( !SvOK(*hello) ) {
                warn("hello is undef.");
            }
            printf("hello: %s\n", SvPV_nolen(*hello));
        } else {
            croak("Parameter hello required.");
        }

I hope you have enjoy this article about XS.

Erratas: sizeof was unfortunatelly not enought to get the size of the array, so I did it passing the size manually to the subroutine which sightly complicates the code.

Erratas II: I left some printf statements not needed.

Erratas III: I left a stdio.h import in the C file which was no longer needed, I used it with debugging purposes.

Categories
Perl XS

Playing more with XS: Retrieving subroutine arguments as a hash.

It’s common to use the argument list as a hash in Perl to provide a easier way for the Perl developers interface with subroutines, in this article I will explain how to do the same in XS. Note: not as easy as in Perl.

First we will change the signature like this to take variadic arguments:

void
print(self, ...)
    Peertube_DL_Mega_Helper self
CODE:

As you may now self is the caller object in Perl, not too much trouble, and … is saying basically “Pass me whatever and I will handle it.”, Peertube_DL_Mega_Helper is a custom typemap made by me in the last XS article.

Now we are going to check if users passed us more than a value, for example:

if ( item <= 1 ) {
    croak("Less parameters than expected.");
}

This is a little agresive way to handle this suppose, but this is a demo, feel free to take whatever behaviour you feel more appropiate for your case use.

Now we are going to define a hash to hold our arguments:

HV *hash = newHV();

And some variables useful in the argument processing loop:

bool is_key = true;
char *key;
STRLEN key_len;

And we define the loop to introduce those arguments into a our hash:

for ( int i = 1; i < items; i++ ) {
    if (is_key) {
        key = SvPV(ST(i), key_len);
    } else {    
        SvREFCNT_inc(ST(i));
        if ( !hv_store(hash, key, key_len, ST(i), 0) ) {
            SvREFCNT_dec(ST(i));
        }
    }
    is_key = !is_key;
}

Unfortunatelly this loop misses warning the developer if a even number of elements is passed with default to the last element without value to be undef, we are going to implement also that:

if ( !is_key ) { 
    warn("Even number of parameters in hash argument list.");
    SV *undef = sv_newmortal();
    SvREFCNT_inc(undef);
    if ( !hv_store(hash, key, key_len, sv_newmortal(), 0) ) {
        SvREFCNT_dec(undef);
    }
}

Croak is also an option, but this introduces how to add undefs and I thought would be educative to handle that case as Perl does™

Now we are going to use what we just did to search for the named parameter “hello”, let’s see how:

// Reusing the variable since the name is really convenient and descriptive.
key = "hello"; 
SV **hello = hv_fetch(hash, key, strlen(key), 0);
if ( hello != NULL && *hello != NULL ) {
    if ( !SvOK(*hello) ) {
        warn("hello is undef.");
    }
    printf("hello: %s\n", SvPV_nolen(*hello));
} else {
    croak("Parameter hello required.");
}

After compiling with cpanm . -v to catch errors we are going to run the usual oneliners to check the capabilities we added, I may do an article of testing in Perl later, now I am just learning XS.

sergio@tiki ~/Peertube-dl $ perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print("hello" => "world")';
hello: world
sergio@tiki ~/Peertube-dl $ perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print';
Less parameters than expected. at -e line 1.
sergio@tiki ~/Peertube-dl $ perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print("hello" => "world", "hello" => )';
Even number of parameters in hash argument list. at -e line 1.
hello is undef. at -e line 1.
hello: 

I hope you enjoyed the article.

Categories
Perl

Why Perl is my favourite programming language in 2021?

Perl started it’s development in 1987 before the born of the Linux kernel and gained popularity as a web development language for CGI scripting during the 90s since then many developers have categorized the language as a write only language, but I think this opinions are deprecated with the today features of modern Perl plus the expressiveness Perl have been given to it’s developers since it’s starting.

Perl has three important types of variables, each of one features a sigil which must be before any representation of that variable.

$ features a Scalar variable, a variable which can hold only a element.

@ features an Array variable, a variable that holds a list of elements.

% features a Hash variable, a variable that holds a map of String to Scalar without order.

This differentiation plus the usage of “use strict;’ makes the development of Perl easier since many complex data type errors can be caught on compile time.

Perl is a dynamic typed programming language as Python or Javascript, so you can assign a Scalar variable any value like a String, a Number, Undef (Perl has no null.) or a Reference to a more complex type of variable.

By default Perl interprets that if you are passing a Array or Hash to a Subroutine what you want is passing the contents of this structure as the parameters of the Subroutine which makes some constructs longer in some programming languages to be written without hastle.

Let’s put it in a example:

use strict;
use warnings;

use feature 'say';

sub ExampleSubroutine {
    my ($a, $b, $c) = @_;
    say $a;
    say $b;
    say $c;
}

my @params = ('a', 'b', 'c');
ExampleSubroutine(@params);

The lack of need to specify signatures to subroutines allows a lot of expresiveness also, althought you may use libraries like https://metacpan.org/pod/Params::ValidationCompiler in more complex programs to get a self-documenting interface to your methods and subroutines.

Testing in Perl is also great thanks to great resources from cpan like https://metacpan.org/pod/Test::MockObject, https://metacpan.org/pod/Test::MockModule and https://metacpan.org/pod/Test::Most and the flexibity of Perl on mocking subroutines that allows to test both procedural and OOP in a really easy way no matter there was not test before unlike PHP where mocking it’s limited to object methods.

Tons of the Perl cpan modules are packaged in GNU/Linux distributions so if you want to package your software to work in a GNU/Linux distribution it is not a so difficult task.

Perl also grants its developers a great UTF-8 optionally aware regex engine which helps to parse complex data in a easy manner and it’s the most common way to manipulate Strings in Perl.

The Perl object’s are just blessed references which allows your object to be a Hash (The most common since allows the common key/value internal scheme of an Object.), an Array, an Subroutine (To force encapsulation by having the real attributes in a more restricted scope.)

Perl anonymous subroutines inherit the scope allowing to do some functional programming tricks.

There are great libraries to work with files in Perl like https://metacpan.org/pod/Path::Tiny which allows a near natural language way to manipulate files without having to be so aware about file handles or directory handles as the default. (The default sometimes also plays well in scripts.)

The Perl command line debugger is simply great, allowing to discover bugs in a easy and reproducible way so you can share debugging sessions as plain text with other programmers.

The default variable $_ allows the programmer to not to be having to name a variable that is used in a very limited scope before stopping being useful.

The for syntax is really cool $object->doSomething for my $object (@objects) or it’s anonymous counterpart $_->doSomething for (@objects)

It’s not always needed to put parents on subroutine calls allowing the developer to type less when the scope is clear.

It is also great the “There is more than a way to do it” philosophy behind the language, allowing you to choose the best way to do for your problem.

When we talk about optimization we should first know for what we want to optimize like Speed, CPU, Memory Usage or code Legibility it’s not uncommon you have to trade one to have the other, for example a complex and extensive data structure can be stored in disk to save memory but accesing it will harm Speed, CPU usage and Legibility by increasing the latency because disk reads, increasing the CPU usage because the aditional cycles to retrieve the data when needed and Legibility by doing the program more complex.

It’s is great a language takes different problems have different needs and adapt itself to the multiple ways to tackle a problem a developer may have.

Categories
Perl XS

Perl niceness, building a XS module that acts like a Perl object.

I have been trying the last days to add Mega.nz support to Peertube-dl, it is not an easy task since Mega.nz implements encryption with AES in the files so the server owner is not able in theory to read the users files. (My opinion is anyway the server admins would be able to get access to those files by simply parsing the server logs.)

I have not success yet, because the lack of good Perl libraries and the bad Perl is for parsing binary data and then I thought, why not try to use XS to solve this problem as I did before with the Javascript interpreter need for Youtube?

So this I am trying, and in the process I did a little improvement in my way to write XS thanks to the learn about how to use typemaps to build more ideomatic for both Perl and C XS modules.

Let’s start the house by the roof, would not be great being able to write a XS module like this:

#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <stdio.h>

typedef struct helper {
    int value;
} * Peertube_DL_Mega_Helper;

MODULE = Peertube::DL::Mega::Helper  PACKAGE = Peertube::DL::Mega::Helper
PROTOTYPES: DISABLE

Peertube_DL_Mega_Helper
new(class, value)
    char *class
    int value
    CODE:
        RETVAL = malloc(sizeof (struct helper));
        RETVAL->value = value;
    OUTPUT:
        RETVAL

void
print(self)
    Peertube_DL_Mega_Helper self
    CODE:
        printf("%d\n", self->value);

void
DESTROY(self)
    Peertube_DL_Mega_Helper self
    CODE:
        free(self);

This code can be achieved with the help of the file typemap, with the help of https://perldoc.perl.org/perlxstypemap I discovered that, and the following typemap will look familiar to anybody which have read that document.

TYPEMAP
    Peertube_DL_Mega_Helper T_PTROBJ_SPECIAL

INPUT
T_PTROBJ_SPECIAL
    if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
        IV tmp = SvIV((SV*)SvRV($arg));
        $var = INT2PTR($type, tmp);
    } else {
        croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\");
    }

OUTPUT
T_PTROBJ_SPECIAL
    sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
           (void*)$var);

Of course some more additions will be needed to make this code run, like the clasical xsloader Perl module and some lines in the Makefile.PL which is out of scope in this document.

We will be able to run this code like this and check it works:

perl -MPeertube::DL::Mega::Helper -e 'Peertube::DL::Mega::Helper->new(3)->print';

It should print 3.

It is true XS is scaring because strange subroutines and magical variable names and tons of rare sections difficult to understand on the first read of the docs, but this kind of sugar makes my day, it is a shame there aren’t much XS articles and Stackoverflow responses because makes the language slower to learn.

Categories
Real languages

Real languages evolution: Are we in the road to create a programmers language?

Featured image took shamelessly from https://imgs.xkcd.com/comics/linguists.png feel free to read some of his kind comics.

After reading a post talking about the English segmentation between geographical parts of the Earth I came to think about what other kind of barriers could exists which could allow English to evolve independently and I thought professions and digital communities could be this barrier.

Let’s think about it a great amount of today programmers and system administrators are not native english speakers and many of them write English conveying the way of writting they saw in documentation which may not have been written by a native English speaker, the result is mutations of common English would be not so rare to be created and extended between the informatics community in a daily basics.

It is not uncommon to hear about someone who heard a conversation between two programmers and could not understand a single word because not having the abstract vocabulary we use.

There are more examples I can think about my hypotesis like the Bussiness English based on giving confidence about your products and organizations or the Gamers English plaged by acronyms like wtf, lol, gg, gges, etc.

Also a special technical english is teached for students of many fields like biology as a teacher of me during the ESO explained to me.

When are we going to found ourselves saying “This is not Latin anymore”?