Chris Lamb: Web scraping: Let's move on

Planet Debian - Wed, 07/01/2015 - 19:53

Every few days, someone publishes a new guide, tutorial, library or framework about web scraping, the practice of extracting information from websites where an API is either not provided or is otherwise incomplete.

However, I find these resources fundamentally deceptive — the arduous parts of "real world" scraping simply aren't in the parsing and extraction of data from the target page, the typical focus of these articles.

The difficulties are invariably in "post-processing": working around incomplete data on the page, handling errors gracefully and retrying in some (but not all) situations, keeping on top of layout/URL/data changes to the target site, not hitting your target site too often, logging into the target site if necessary and rotating credentials and IP addresses, respecting robots.txt, the target site being utterly braindead, keeping users meaningfully informed of scraping progress if they are waiting on it, the target site adding and removing data resulting in a null-leaning database schema, sane parallelisation in the presence of prioritisation of important requests, difficulties in monitoring a scraping system due to its implicitly non-deterministic nature, and general problems associated with long-running background processes in web stacks.
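To make just one of these concerns concrete, here is a minimal JavaScript sketch of "retrying in some (but not all) situations". The set of retryable HTTP status codes, the attempt limit and the delay values are assumptions chosen for illustration, and shouldRetry and backoffDelayMs are hypothetical helper names rather than part of any library.

```javascript
// Status codes treated as transient. This policy is an assumption; a real
// scraper would tune it per target site.
const RETRYABLE_STATUSES = new Set([408, 429, 500, 502, 503, 504]);

// Decide whether a failed request should be attempted again.
// `attempt` counts the attempts already made, starting at 0.
function shouldRetry(status, attempt, maxAttempts = 5) {
  return attempt < maxAttempts && RETRYABLE_STATUSES.has(status);
}

// Exponential backoff with a cap, so the target site is not hit too often.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Even this small policy leaves open most of the questions listed above, such as who is told about the delay and how retries interact with request prioritisation.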

In other words, extracting the right text on the page is by far the easiest and most trivial part, with little practical difference between using an admittedly cute jQuery-esque parsing library and just using a blunt regular expression.

It would be quixotic to simply retort that sites should provide "proper" APIs but I would love to see more attempts at solutions that go beyond the superficial.

Categories: Elsewhere

Blink Reaction: Part Three: Getting your Site onto the VM

Planet Drupal - Wed, 07/01/2015 - 19:16

In the last post, we got our VM up and running. Now we need to configure a hostname for it, as well as upload our site to the VM so we can start developing!

Configuring your Host OS

There’s one more step you need to perform in order for VDD to function properly on your system. You need to modify your machine’s hosts file so that you can visit your new VM by hostname instead of by typing in an IP Address.


Aten Design Group: How to Easily Create Drupal Webforms in Code

Planet Drupal - Wed, 07/01/2015 - 16:10

Drupal webforms are useful in a variety of contexts, but the most typical context is something like a contact form: user-facing functionality that needs to exist when a site launches, and be easily edited by a site owner post-launch. In that context, webforms should be created automatically for a smooth, predictable launch. There are a few ways you can do that, including the Webform Features module, the Universally Unique IDentifier (UUID) module or custom code, maybe following documentation on Drupal.org.

When making webforms on a recent site, none of these options appealed to me. I wanted to manage webforms in code pre-launch, then hand them to a content editor to manage (outside code) post-launch. The Features-based options for creating webforms were okay pre-launch, but would add overhead post-launch. And creating a webform node from scratch seemed overly complicated to manage pre-launch. So I wrote the interface I wanted for creating and managing Drupal webforms, and it's now in the Config in Code (CINC) module for anyone to use.

Here's the example linked above from Drupal.org, implemented in this new CINC-based approach:

    $webform = CINC::init('Webform')->machine_name('Contact Us');

    $components = array();

    $components[] = CINC::init('WebformComponent')->set('form_key', 'gender')
      ->set('type', 'select')
      ->set('mandatory', 1)
      ->set('extra.items', "Mrs|Mrs\nMiss|Miss\nMr|Mr")
      ->set('extra.aslist', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'name')
      ->set('name', 'Last name')
      ->set('mandatory', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'first_name')
      ->set('mandatory', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'city');

    $components[] = CINC::init('WebformComponent')->set('form_key', 'country')
      ->set('type', 'select')
      ->set('extra.options_source', 'countries')
      ->set('extra.aslist', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'email_address')
      ->set('type', 'email')
      ->set('mandatory', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'subject')
      ->set('type', 'select')
      ->set('extra.items', "s1|Subject 1\nother|Other")
      ->set('extra.aslist', 1)
      ->set('mandatory', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'message')
      ->set('type', 'textarea')
      ->set('mandatory', 1);

    $components[] = CINC::init('WebformComponent')->set('form_key', 'mandatory_fields')
      ->set('type', 'markup')
      ->set('value', '<p>Fields with * are mandatory</p>')
      ->set('extra.format', 'full_html');

    foreach ($components as $index => $component) {
      $webform->add_component(
        $component->set('weight', $index * 5)->set('extra.title_display', 'inline')
      );
    }

    $webform->add_email('somebody@example.tld');

    $webform->create();

The line count on that (52) is less than a third of the non-CINC example on Drupal.org (170), and did not require any time clicking around in a browser to create and export the webform. The code is also far more readable than both a Features export and starting from scratch, which makes it more maintainable. You may look at that "city" component and think I left something out, but that's really the entire code needed for a textfield with a name matching its form_key. Sensible defaults are nice.

As an added bonus, the CINC interface can also be used to read, update, and delete existing webforms. So if you need your Drupal webforms in code and Features isn't the best option for some reason, I invite you to enjoy the ease of creating webforms programmatically with CINC.


Jonathan Brown: Using HD Bitcoin wallets with Drupal Coin Tools

Planet Drupal - Wed, 07/01/2015 - 14:10

Previously: Drupal / Bitcoin BIP 70 / PKI certificates

Each Coin Tools payment needs its own Bitcoin address. This is necessary so that it is clear whether or not the payment has been completed. It is also important for preserving anonymity.

In order to participate in the Bitcoin network, a Drupal website must talk to a Bitcoin node. Currently Coin Tools utilises the reference implementation, bitcoind.

bitcoind has wallet functionality built in. In fact, it was originally released as a desktop wallet for Microsoft Windows. By default, bitcoind will pre-generate a pool of 100 pairs of addresses and corresponding private keys. This pool will be increased as necessary.

This presents a number of problems. If data-loss were to occur on the server, the private keys could be unrecoverable and therefore the funds stored on the addresses would be unspendable. If a hacker gains access to the server they could copy the keys and steal the funds. The private keys can be encrypted, but the password is exposed on the server when generating new keys and spending funds.

To solve these problems, key pairs could be pre-generated in a secure environment and then the public addresses uploaded to the server.

Logistically this is challenging. A much more robust solution to this problem is to use Hierarchical Deterministic Wallets as described in BIP 32 (with draft extensions in BIPs 43, 44 & 45).

HD wallets are composed of a tree of pairs of extended public (xpub) and extended private (xprv) keys derived from a single seed or mnemonic sentence. An xprv can generate its child xpubs and child xprvs. An xpub can only generate its child xpubs. Any extended key can be converted into its non-extended variant, which cannot generate children.

A non-extended public key can be converted into a payment address. A non-extended private key can be used to spend funds that are held on the payment address it is associated with.

An example extended public key is:

An example extended private key is:

Extended key pairs are also considered to be either hardened or non-hardened. One of the properties of extended keys is that if an attacker knows a non-hardened private key and the parent xpub, they are able to determine the parent xprv.

In situations where private keys are to be distributed, for example within a company, hardened derivation must be used to prevent other private keys at the same level from being determined.

A further property of extended keys is that xpubs are not capable of generating hardened child public keys at all. This is fine because in an untrusted environment (with only a non-hardened xpub) no private keys will be present.

Payment addresses in an HD wallet can be considered to be either internal or external. External addresses are used when funds are being paid into an account from outside the wallet. Internal addresses are used as change addresses.

The default wallet layout is shown below:

HD wallets have many use-cases and BIP 32 identifies several.

"Unsecure money receiver" is the use-case to solve the problem described in this blog post.

The idea is to maintain an HD wallet in a secure environment. An account would be created in this wallet for the purpose of receiving payments in a specific Coin Tools payment type. The xpub for external addresses from this account would then be exported and added to the configuration of the payment type within Drupal.

Despite the obvious complexity of HD wallets, the concept of creating an account for a specific person, organisation or reason and exporting the xpub is actually very simple. The key point is that only one authority should be making payments into a specific xpub, otherwise addresses would be used multiple times. Scanning for unused addresses would not be an effective strategy to prevent this. Stealth addresses could become a solution for allocation of payment addresses without an authority.

Coin Tools can "interrogate" the provided xpub. The results of this process will be displayed, including the first four addresses that can be generated from the xpub and the relative path of the next address Coin Tools will generate:

Every key pair in the wallet has a path specifying the indexes at each level in the hierarchy, for example M/44'/0'/0'/0/3. Absolute paths have either an m or M as their first component. Relative paths have an index as their first component. In the example above we can see that the xpub has a depth of 3, so relative paths start with describing the index at depth 4.

A ' or H character after an index in the path indicates that the index is actually i + 2^31 (that is, i + 2147483648). This means that keys at this level have hardened derivation.
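The path notation can be illustrated with a short JavaScript sketch that turns a path string into numeric indexes. It is only an illustration of the notation described above, not Coin Tools code: parsePath and derivableFromXpub are hypothetical helper names, and no key derivation is performed.

```javascript
// Offset added to an index to mark it as hardened (2^31 = 2147483648).
const HARDENED_OFFSET = 2 ** 31;

// Parse a path such as "M/44'/0'/0'/0/3" (absolute) or "0/3" (relative)
// into numeric indexes. A trailing ' or H marks a hardened index.
function parsePath(path) {
  const parts = path.split('/');
  const absolute = parts[0] === 'm' || parts[0] === 'M';
  const components = absolute ? parts.slice(1) : parts;

  const indexes = components.map((component) => {
    const match = /^(\d+)('|H)?$/.exec(component);
    if (!match) {
      throw new Error('Invalid path component: ' + component);
    }
    const index = Number(match[1]);
    if (index >= HARDENED_OFFSET) {
      throw new Error('Index out of range: ' + component);
    }
    return match[2] ? index + HARDENED_OFFSET : index;
  });

  return { absolute, indexes };
}

// An xpub cannot generate hardened children, so a relative path is only
// usable with an xpub if every index is below the hardened offset.
function derivableFromXpub(relativeIndexes) {
  return relativeIndexes.every((index) => index < HARDENED_OFFSET);
}

console.log(parsePath("M/44'/0'/0'/0/3").indexes);
// [ 2147483692, 2147483648, 2147483648, 0, 3 ]
```

The relative path 0/3 below a depth-3 account xpub parses to [0, 3], which contains no hardened index and so can be derived without any private key.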

According to BIP 43 (draft), the index at level 1 should be the hardened index of the BIP that describes the layout of the hierarchy beneath it. In the example above it is 44, meaning that it is using the layout from BIP 44 instead of the default one from BIP 32.

Despite the fact that many wallets are now HD, support for exporting account xpubs is currently quite low. However, some HD wallets that do not allow xpubs to be exported for regular accounts will allow xpubs to be exported from multisig accounts, for example Coinkite. In the future Coin Tools will support generation of addresses from multisig xpubs.

The only wallets that I know of that will export an xpub from a non-multisig account are Wallet32 and Electrum.

Both these wallets export xpubs that allow derivation of both the external (0/i) and internal (1/i) addresses. This is useful for watching an account balance but means that an entity making the payments into the account has greater ability to spy on subsequent transactions than would otherwise be possible. Coin Tools is currently hard coded to use the relative path 0/i. This will need to be made configurable as Coinkite xpubs do not need any prefix on the index.

It is essential that the addresses generated by Coin Tools match those generated by the wallet otherwise the account will not receive the payments. The xpub in the previous screenshot was exported from a Wallet32 account. In the following screenshot we can see that the addresses displayed in Wallet32 are the same as those generated by Coin Tools:

In theory when an xpub is imported the addresses should be scanned to make sure the xpub has not been used before. However, bitcoind does not maintain the correct indexes to be able to quickly list transactions for arbitrary addresses. When Coin Tools needs to receive a payment on an address from an xpub, it adds it as a watch-only address using the "importaddress" bitcoind command. The "rescan" parameter is set to false which means that transactions that happened before the address was added are not detectable. If this parameter is set to true it can take many minutes even on an SSD to import each address.
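As a rough illustration of the call described above, the following JavaScript sketch builds a JSON-RPC request body for bitcoind's importaddress command with rescan disabled. The helper name buildImportAddressRequest, the label and the request id are assumptions made for this example, and the address is a placeholder; this is not how Coin Tools itself issues the call.

```javascript
// Build the JSON-RPC request body for bitcoind's "importaddress" command.
// The positional parameters are: address, label, rescan.
function buildImportAddressRequest(address, label, id) {
  return {
    jsonrpc: '1.0',
    id: id,
    method: 'importaddress',
    // rescan = false: the address is watched from now on, but earlier
    // transactions involving it are not detected.
    params: [address, label, false],
  };
}

// PLACEHOLDER_ADDRESS stands in for an address derived from the xpub.
const request = buildImportAddressRequest('PLACEHOLDER_ADDRESS', 'coin_tools', 1);
console.log(JSON.stringify(request));
```

Setting the third parameter to true would trigger the slow rescan mentioned above for every imported address.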

Coin Tools uses the Drupal State API to maintain the next index for each xpub. If there are more unreceived payments in a row than the gap limit of the wallet software, the wallet will lose track of later payments. To avoid this happening, payments that expire should maybe have their addresses put in a pool for re-use. However, this may cause a problem if someone records the payment address and then satisfies the payment at a later time after it has expired.
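The gap limit concern can be sketched in JavaScript as follows. This is an illustrative model only: the function names are hypothetical, the received array is an assumed way of recording which issued addresses have been paid, and the gap limit value must be taken from the wallet software actually in use.

```javascript
// Count how many of the most recently issued addresses are still unpaid.
// received[i] is true if the address at index i has received a payment.
function trailingUnpaidCount(received) {
  let count = 0;
  for (let i = received.length - 1; i >= 0 && !received[i]; i--) {
    count++;
  }
  return count;
}

// Issuing one more address is only safe if the wallet, looking ahead
// gapLimit addresses past the last paid one, would still reach it.
function canIssueNextAddress(received, gapLimit) {
  return trailingUnpaidCount(received) < gapLimit;
}

// Indexes 0 and 1 were paid; 2, 3 and 4 were issued but never paid.
const received = [true, true, false, false, false];
console.log(canIssueNextAddress(received, 20)); // true
console.log(canIssueNextAddress(received, 3));  // false
```

When the check fails, re-using an expired address from a pool, as suggested above, is one way to avoid extending the run of unpaid addresses.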

Of course, if a hacker gained access to the web server they could change an xpub to their own. This would mean that until the problem was detected and the service shut down the hacker would be receiving the funds instead of the intended recipient. While damaging, this would be nowhere near as bad as the total loss of a hot wallet.

In order to facilitate handling of HD wallets, Coin Tools was converted to use the BitWasp PHP Bitcoin library instead of Gogulski.


Kristian Polso: Integrating Twitter feed to your Drupal site

Planet Drupal - Wed, 07/01/2015 - 08:00
The Twitter API can be a major PITA sometimes, but luckily there are modules for Drupal that make integrating it into your website easy.

Drupal core announcements: Drupal core critical issues sprint in Princeton, Jan. 29 to Feb. 1

Planet Drupal - Wed, 07/01/2015 - 04:24
Start: 2015-01-29 (All day) - 2015-02-01 (All day) America/New_York
Sprint Organizers: pwolanin, davidhernandez

Timed to coincide with the 4th DrupalCamp NJ and focusing on issues that were not addressed at the recent sprint in Ghent, Belgium.

The focus of this sprint will be resolving critical menu, menu link, and routing issues in Drupal 8.

Dates: Thursday, January 29 through Sunday, February 1
Location: Princeton University in Princeton, NJ. (See the camp website for details.)
Travel: From Newark Airport (EWR), a good option is the Newark Airport to Princeton Junction train.

Confirmed attendees for this area of focus include pwolanin, dawehner, kgoel, and mpdonadio. Additional attendees may include xjm, Wim Leers, effulgentsia, and beejeebus.

Most of the travel expenses for attendees to work on menu, menu link, and routing issues are being paid for by a grant from the new Drupal Association Drupal 8 Accelerate program.

Additionally, local participants plan to work on core issues related to finishing the "Classy" theme in core so that the base "Stark" theme lives up to its name and serves as a true blank slate of HTML.

We only have limited additional space available, so please contact pwolanin if you'd like to participate in the sprint. Everyone is welcome to attend the camp (while tickets last!) and the Drupal mentoring and collaboration day on Feb 1.

Many of the expected attendees participated in person or remotely at this past summer's Drupal 8 at the Jersey Shore sprint.


Dirk Eddelbuettel: RcppCNPy 0.2.4

Planet Debian - Wed, 07/01/2015 - 03:56

A new release of the RcppCNPy package is now on CRAN.

This release mostly solidifies and fixes things. Support for saving integer objects, which was expanded in release 0.2.3, was not entirely correct. Operations on big-endian systems were not up to snuff either.

Wush Wu helped in getting this right with very diligent testing and patching, particularly on big-endian hardware. We also got a pull request from Romain to better reflect const correctness on the Rcpp side of things. Last but not least, we were obliged by the CRAN Maintainers to not assume one could call gzip from a system() call because, well, you guessed it.

Changes in version 0.2.4 (2015-01-05)
  • Support for saving integer objects was not correct and has been fixed.

  • Support for loading and saving on 'big endian' systems was incomplete, has been greatly expanded and corrected, thanks in large part to very diligent testing as well as patching by Wush Wu.

  • The implementation now uses const iterators, thanks to a pull request by Romain Francois.

  • The vignette no longer assumes that one can call gzip via system as the world's leading consumer OS may disagree.

CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp may be the best place to start a discussion. GitHub issue tickets are also welcome.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.


Capgemini Engineering: Drupal 8 in 2 steps

Planet Drupal - Wed, 07/01/2015 - 01:00

Drupal 8 is the latest version of Drupal, a modern, PHP 5.4-boasting, REST-capable, object-oriented powerhouse. The concepts are still the same as the previous versions but the approach is now different. Drupal 8 comes with a modern Object Oriented Programming (OOP) approach to most parts of the system thanks to the use of the Symfony2 framework.

I took part in DrupalCon Amsterdam and enjoyed a number of really interesting talks about Drupal 8, among them ‘Drupal 8: The Crash Course’, created and presented by Larry Garfield. In this post the idea is to recap a few key points of his talk, as I think they are important to fully understand the basics of this new Drupal version. In case you are interested you can also watch the full talk.

How do I define a module?

In Drupal 8 to define a module we need only a YAML (.info.yml) file:


    name: D8 Test Module
    description: D8 Test Module
    type: module
    core: 8.x
    package: Custom

In Drupal 8 the .module file is not required anymore, so with only the .info.yml file the module is ready to be enabled.

How do I make a page?

Start by creating a controller that extends the ControllerBase class and returns the output of the page:


    namespace Drupal\d8_example_module\Controller;

    use Drupal\Core\Controller\ControllerBase;

    class D8ExampleModuleController extends ControllerBase {

      public function test_page($from, $to) {
        $message = $this->t('%from to %to', [
          '%from' => $from,
          '%to' => $to,
        ]);

        return $message;
      }

    }

Once this is done, within the .routing.yml file we can define the path, the content, the title and the permissions:


    d8_example_module.test_page:
      path: '/test-page/{from}/{to}'
      defaults:
        _content: 'Drupal\d8_example_module\Controller\D8ExampleModuleController::test_page'
        _title: 'Test Page!'
      requirements:
        _permission: 'access content'

How do I make content themeable?

We still have the hook_theme() function to define our theme:


    /**
     * Implements hook_theme().
     */
    function d8_example_module_theme() {
      $theme['d8_example_module_page_theme'] = [
        'variables' => ['from' => NULL, 'to' => NULL],
        'template' => 'd8-theme-page',
      ];

      return $theme;
    }

For the template page Drupal 8 uses Twig, a third-party template language used by many PHP projects. For more info about Twig have a look at Twig in Drupal 8. One of the cool parts of Twig is that we can do string translation directly in the template file:


    <section>
      {% trans %}
        <strong>{{ from }}</strong> to <em>{{ to }}</em>
      {% endtrans %}
    </section>

And then we assign the theme to the page:


    namespace Drupal\d8_example_module\Controller;

    use Drupal\Core\Controller\ControllerBase;

    class D8ExampleModuleController extends ControllerBase {

      public function test_page($from, $to) {
        return [
          '#theme' => 'd8_example_module_page_theme',
          '#from' => $from,
          '#to' => $to,
        ];
      }

    }

How do I define a variable?

Drupal 8 has a whole new configuration system that uses human-readable YAML (.yml) text files to store configuration items. For more info have a look at Managing configuration in Drupal 8.

We define variables in config/install/*.settings.yml:


default_count: 3

The variables will be stored in the database during the installation of the module. We define the schema for the variables in config/schema/*.settings.yml:


    d8_example_module.settings:
      type: mapping
      label: 'D8 Example Module settings'
      mapping:
        default_count:
          type: integer
          label: 'Default count'

How do I make a form?

To create a form we extend a ConfigFormBase class:


    namespace Drupal\d8_example_module\Form;

    use Drupal\Core\Form\ConfigFormBase;
    use Drupal\Core\Form\FormStateInterface;

    class TestForm extends ConfigFormBase {

      public function getFormId() {
        return 'test_form';
      }

      public function buildForm(array $form, FormStateInterface $form_state) {
        $config = $this->config('d8_example_module.settings');

        $form['default_count'] = [
          '#type' => 'number',
          '#title' => $this->t('Default count'),
          '#default_value' => $config->get('default_count'),
        ];

        return parent::buildForm($form, $form_state);
      }

      public function submitForm(array &$form, FormStateInterface $form_state) {
        parent::submitForm($form, $form_state);

        $config = $this->config('d8_example_module.settings');
        $config->set('default_count', $form_state->getValue('default_count'));
        $config->save();
      }

    }

Then within the .routing.yml file we can define the path, the content, the title and the permissions:


    d8_example_module.test_form:
      path: /admin/config/system/test-form
      defaults:
        _form: 'Drupal\d8_example_module\Form\TestForm'
        _title: 'Test Form'
      requirements:
        _permission: 'configure_form'

We use another YAML file (.permissions.yml) to define permissions:


    'configure_form':
      title: 'Access to Test Form'
      description: 'Set the Default Count variable'

We also use another YAML file (.links.menu.yml) to define menu links:


    d8_example_module.test_form:
      title: 'Test Form'
      description: 'Set the Default Count variable'
      route_name: d8_example_module.test_form
      parent: system.admin_config_system

How do I make a block?

To create a block we extend a BlockBase class:


    namespace Drupal\d8_example_module\Plugin\Block;

    use Drupal\Core\Block\BlockBase;

    /**
     * Test Block.
     *
     * @Block(
     *   id = "test_block",
     *   admin_label = @Translation("Test Block"),
     *   category = @Translation("System")
     * )
     */
    class TestBlock extends BlockBase {

      public function build() {
        return [
          '#markup' => $this->t('Block content...'),
        ];
      }

    }

In this way the block is ready to be configured in the CMS (/admin/structure/block). Here is an example of a more complex block:

    namespace Drupal\d8_example_module\Plugin\Block;

    use Drupal\Core\Block\BlockBase;
    use Drupal\Core\Form\FormStateInterface;

    /**
     * Test Block.
     *
     * @Block(
     *   id = "test_block",
     *   admin_label = @Translation("Test Block"),
     *   category = @Translation("System")
     * )
     */
    class TestBlock extends BlockBase {

      public function defaultConfiguration() {
        return ['enabled' => 1];
      }

      public function blockForm($form, FormStateInterface $form_state) {
        $form['enabled'] = [
          '#type' => 'checkbox',
          '#title' => $this->t('Configuration enabled'),
          '#default_value' => $this->configuration['enabled'],
        ];

        return $form;
      }

      public function blockSubmit($form, FormStateInterface $form_state) {
        $this->configuration['enabled'] = (bool)$form_state->getValue('enabled');
      }

      public function build() {
        if ($this->configuration['enabled']) {
          $message = $this->t('Configuration enabled');
        }
        else {
          $message = $this->t('Configuration disabled');
        }

        return [
          '#markup' => $message,
        ];
      }

    }

Structure of a module

The structure of a module should look like the example module d8_example_module:

    d8_example_module
    |
    |- config
    |  |- install
    |  |  |- d8_example_module.settings.yml
    |  |- schema
    |     |- d8_example_module.settings.yml
    |
    |- src
    |  |- Controller
    |  |  |- D8ExampleModuleController.php
    |  |- Form
    |  |  |- TestForm.php
    |  |- Plugin
    |     |- Block
    |        |- TestBlock.php
    |
    |- templates
    |  |- d8-theme-page.html.twig
    |
    |- d8_example_module.info.yml
    |- d8_example_module.links.menu.yml
    |- d8_example_module.module
    |- d8_example_module.permissions.yml
    |- d8_example_module.routing.yml

Drupal 8 in 2 steps: Extend a base Class or implement an Interface and tell Drupal about it.

Download the example module

Drupal 8 in 2 steps was originally published by Capgemini at Capgemini on January 07, 2015.


PreviousNext: Drupal Testing Roadmap

Planet Drupal - Tue, 06/01/2015 - 23:19

Recently the patch to bring Mink-based testing to Drupal core went green. As a result, Lee Rowlands (@larowlan), Nick Schuch (@wesome1989), Adam Hoenich (@djphenaproxima), and I (@grom358) had a discussion to create a roadmap for improving testing in Drupal core. Here is what we discussed.


Mediacurrent: Introducing the Mediacurrent Contrib Committee

Planet Drupal - Tue, 06/01/2015 - 22:21

After Mediacurrent's excellent retreat in October 2014 it was decided to set up some internal committees to help organize various company initiatives. Several of these committees were fairly straightforward (marketing, training, porting our corporate site to D8, etc.), but I felt that one had been overlooked: a committee for organizing our contrib efforts.


Drupal Watchdog: PHP and JavaScript Closures

Planet Drupal - Tue, 06/01/2015 - 22:17

PHP closures are pretty simple as they are barely more than syntactic sugar over the following:

    class Something {
      function __construct($x) {
        $this->x = $x;
      }
      function __invoke($y) {
        extract(get_object_vars($this));
        // Your closure here.
      }
    }


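    // The equivalent closure. Closures are anonymous, so it is assigned
    // to a variable instead of being given a function name.
    $something = function ($y) use ($x) {
      // Your closure here.
    };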

So closures are objects with a small difference: they are constructed automatically, and once constructed they cannot be changed; the only thing you can do with them is call them. Now it should be easy to see how variables work: variables given in use() are copied to properties on the object. If $x is itself an object then of course only its handle is copied, so changing it from inside the closure affects it everywhere else, exactly as objects behave in any other operation. All this is quite consistent with how PHP works and relatively simple to understand. To share a variable with the parent scope instead of copying it, import it by reference with use (&$x):

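    function foo() {
      $x = 1;
      $y = function() use (&$x) {
        $x++;
        print "in $x\n";
      };
      $y();          // prints "in 2"
      print "$x\n";  // prints "2": the closure changed foo()'s $x
      return $y;
    }
    $func = foo();
    $func();         // prints "in 3": the referenced $x outlives foo()
    print "$x\n";    // $x is not defined in this scope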

JavaScript is just a little different. First of all, there is no explicit import: every variable from the parent scope is available inside the closure. Second, the closure captures the variables themselves rather than copies of them, so changing these variables affects the variables in the parent scope.

    function foo() {
      var x = 1;
      var y = function() {
        x++;
        console.log('in' + x);
      }
      y();             // logs "in2"
      console.log(x);  // logs 2
      return y;
    }
    func = foo();
    func();            // logs "in3"
    console.log(x);    // ReferenceError: x only exists inside foo()

Flow Control

In both languages returning from a closure will simply return to the caller. If the closure is called in a loop then the loop will continue. Short of throwing an exception the closure can’t stop such a loop. See Smalltalk for an example of a language where this is different. Obviously, Common Lisp can do both kinds of returns and the syntax is succinct and easy to understand. Obviously again, Ruby can do both and the syntax is extremely obscure.
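The difference between returning from a closure and leaving the surrounding loop can be seen in a small JavaScript sketch. The function names are hypothetical and chosen for this illustration only; forEach is used as the loop that calls the closure.

```javascript
// Returning from the closure only ends the current call; forEach keeps going.
function collectSkipping(items, stopAt) {
  const seen = [];
  items.forEach(function (item) {
    if (item === stopAt) {
      return; // leaves the closure, not the loop
    }
    seen.push(item);
  });
  return seen;
}

// Throwing is the only way for the closure to end the loop early.
function collectUntil(items, stopAt) {
  const seen = [];
  const stop = {}; // private sentinel so unrelated errors are re-thrown
  try {
    items.forEach(function (item) {
      if (item === stopAt) {
        throw stop;
      }
      seen.push(item);
    });
  } catch (error) {
    if (error !== stop) {
      throw error;
    }
  }
  return seen;
}

console.log(collectSkipping([1, 2, 3, 4], 3)); // [ 1, 2, 4 ]
console.log(collectUntil([1, 2, 3, 4], 3));    // [ 1, 2 ]
```

In the first function the item after the match is still processed, which is exactly the "loop will continue" behaviour described above.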

About $this / this

Since PHP 5.4, you can use $this in a closure. Just imagine that one is passed in via use() and everything will be fine. So $this always means the object the closure is defined in, even if the closure is passed to another method on another object. If necessary, a new closure can be created with a new $this variable: Closure::bind($closure, $newthis) or $closure->bindTo($newthis):

    class foo {
      protected $x = 1;
      function bar() {
        return function() {
          $this->x++;
          print "$this->x\n";
        };
      }
    }
    $func = (new foo)->bar();
    $func();

    class bar {
      protected $x = 10;
    }
    $func2 = $func->bindTo(new bar, "bar"); // "bar" allows the closure to access protected things
    $func2();

In JavaScript, this is determined by how the closure is called rather than by where it is defined, and it can be set explicitly when calling the closure via its call or apply methods. This doesn't have a PHP equivalent. Your favorite framework or native DOM handling will often do this for you: each in jQuery sets this to the current object, event handlers get the current element in this, and so on. ECMAScript 5 introduced the bind method on function objects, which behaves much like bindTo in PHP: something.bind(newThis) returns a new closure with this set to newThis. Examples:

    function foo() {
      x = 1;
      var y = function() {
        this.x++;
        console.log(this.x);
      }
      y();
      return y;
    }
    var func = foo();
    func();

    var func = function() {
      this.x++;
      console.log(this.x);
    }
    func.call({x:1});

    o = {x:10};
    var func2 = func.bind(o); // bind(), not bindTo(), is the JavaScript method
    func2();

Tags: PHP JavaScript Closures

Drupal core announcements: Ghent critical issue sprint recap

Planet Drupal - Tue, 06/01/2015 - 22:08

Last month, 13 sprinters gathered in Ghent, Belgium for a very focused sprint designed to accelerate work on issues blocking the release of Drupal 8. The sprint was a great burst of momentum for the core critical queue -- we went from 115 critical issues at the start of the sprint to only 81 as of today. That means we have 30% fewer critical issues than we did a month ago.

During the 5-day sprint, we worked on an impressive 51 critical issues, 28 of which are already fixed. Of particular note are the 18 upgrade path blockers that we moved forward (that's 70% of the issues blocking a beta-to-beta upgrade path during that time).

Sprint goals: Accomplished!

Before the sprint, we set some goals for the progress we wanted to make on upgrade path blockers for the Entity Field API, Configuration system, and Views. Here's how we did on each of these goals:

Views data structure and Entity Field API integration

We decoupled Views' entity field data from the SQL table structure by storing entity field information in the view configuration. This will make it possible for Views to detect when the entity field schema has changed and respond to the changes (as well as allowing better support for non-SQL databases). We also defined the entity schema changes that Views will need to support, and work is underway to support them.

Content and configuration dependencies in Views

We added content and configuration dependencies to Views so that Views that use entity display modes, field formatters, user roles, and so on can be safely deployed. We also discussed how to store deployable references to entities (for example, the taxonomy term displayed at the top of taxonomy/term/* listings) based on the shared needs of Views and Entity Reference. A patch to implement the proposed API in Views is nearly ready.

Global settings.php overrides

We had a fruitful discussion that clarified the problem space and culminated in splitting the issue into two steps. We retitled and resummarized the original issue into the second step, and began work on the first step. While getting an initial patch for the first step to pass tests, we uncovered several blocker issues, each of which has now been committed. The patch for this issue is now up for review.

Configuration schema

All hidden configuration schema issues are now fixed and will not regress, because all tests now have strict schema checking enabled by default!
To help people get started with config schemas, Gábor Hojtsy also created a very handy cheat sheet that provides the most crucial information at a glance.

Data integrity on module uninstallation

The two critical bugs in this problem space are now fixed:

To implement both of the above, we created a new ModuleUninstallValidatorInterface. We also have a non-critical issue to better integrate those validators when a module is being uninstalled as part of a configuration import.

NOT NULL constraints for entity base fields

Thanks to fast collaboration between plach, amateescu, yched, and fago (which was greatly assisted by having them all together in-person), a patch that fixes the fatal error bug has now been committed. This required resolving some trickiness with entity reference fields, whose target_id property is simultaneously required but not known while in the process of referencing a not-yet-saved entity. The solution results in a more semantically correct API and better delineation of responsibilities between field types, field definitions, and storage handlers for identifying and implementing required-ness.

The UN of Chocolate

We worked hard at the sprint, but also managed to fit in some international chocolate comparisons, with Swiss, Hungarian, and Belgian sweets to power all that coding. Contributor pfrenssen also pledged not to shave until 8.0.0 (1 cm of beard per beta?), and Berdir shared just how brimming with criticals his issue tracker became. We even learned a bit about the history of Ghent, thanks to swentel and his father-in-law.


The sprint was sponsored by the Drupal Association and Wunderkraut.

The following organizations also contributed their employees' time to participate in the sprint:

Finally, thanks to all the sprinters: alexpott, amateescu, Berdir, bircher, dawehner, fago, Gábor Hojtsy, pfrenssen, plach, swentel, Wim Leers, xjm, and yched!

What's next?

With the record-breaking productivity of our sprint, we know that more sprints like these will help get Drupal 8 done. The Drupal Association's D8 Accelerate program will include more critical issue sprints in 2015. Watch for an upcoming sprint on menu and routing criticals at DrupalCamp New Jersey later this month!

Categories: Elsewhere

Marzee Labs: Blueprinting Drupal projects

Planet Drupal - Tue, 06/01/2015 - 20:00

Planning the structure of a Drupal project is important. At Marzee Labs, we've developed some pretty robust methodologies over time to approach new Drupal projects, and in this post we'll outline some of the tools and processes that help us get off the ground in no time. While some of the topics are probably familiar (Drupal makefiles, installation profiles and such), you might learn some new tips and tricks to make your next Drupal project just that tiny bit more automated and run more smoothly.

The blueprint of any Drupal project: the makefile

Any project we start has to have a makefile. Full stop. Requiring that every module, library or theme we use - be it from drupal.org, GitHub, or any other source - is documented in a single file is a great way to quickly get the gist of any Drupal project.

Even though you might want to version your contributed modules (more on this below), the Drupal makefile should form the backbone of your website.

As an example, check out the makefiles of our MZ profile, our boilerplate profile that can be used to kickstart a new project. For a Drupal profile that can be contributed and packaged on drupal.org, we typically have 3 different Makefiles, but now we’re only interested in mz.make.

Here are the instructions to make the link module.

projects[link][version] = 1.3
projects[link][subdir] = "contrib"

Everyone inspecting the site running this profile now knows that you are using the 1.3 version of the Link module.

Need to patch the link module because you encountered a bug or missing functionality? Sure thing. First we scan the drupal.org issue queue for patches. An example is this issue, with a working patch. We add this to our makefile, with a one-liner comment and a link to the issue on d.o.

projects[link][version] = 1.3
projects[link][subdir] = "contrib"
; Provide the original_url when loading the field.
; @see https://www.drupal.org/node/1475790#comment-7743415
projects[link][patch][] = "http://www.drupal.org/files/7.x-1.x-_link_sanitize-bandaid-1475790-16.diff"

And we rebuild our project to test the patch in our Drupal sandbox, passing --projects=link (and also --no-core since we don’t want to rebuild Drupal core)

drush make profiles/mz/mz.make --projects=link --no-core .

Or we download the nifty Drush Patchfile to apply patches directly and work with a patch file (our makefile, in this case).

If you want to use the latest development version of a module, you can also do that. If you do however, always specify the revision hash as well (you find it in the commit log), so you make sure you’re working with that specific development release that you tested.

projects[link][version] = 1.x-dev
projects[link][subdir] = "contrib"
projects[link][revision] = 7dc306c

Feel the power of this? You can quickly evaluate community contributed patches, roll your own (and contribute them as a Gist if they don’t fit on drupal.org), and not be dependent anymore on the module maintainers to publish that new release.

Since you also document every patch used, you’re making this knowledge available to the other developers in the team, to the reviewer of your Pull Request (if you are using the github branching model), or generally as part of the Git history of your project. You can often revisit your makefile to remove patches if they’ve been rolled out in a new release, and update your modules. Make this a habit and it will pay off eventually.

Bundling using profiles

All your custom code and modules, themes and libraries to be installed should be bundled as an installation profile, so your site can be installed over and over.

If you haven’t started organizing your sites as Drupal profiles, you probably should. Have a look at our boilerplate MZ profile. We use it to bundle our favourite contributed modules, but it also has some custom features and parts of our common workflows that we often find useful. Other examples of great Drupal profiles are Commerce Kickstart or Drupal Commons.

Organizing your code in a git repository

It’s time to dive into the organization of your git repository (we love GitHub). If you’ve followed along so far, contributed code is documented in your makefile, while your custom features and code live in your profile. It would be enough to version these, and that is the recommended way if you want to package your code as a profile or a base profile upon which to build new sites.

However, most of the time, you will also need to deploy off this repository directly, so we suggest that you store all your code - including Drupal core & contrib - in the git repository. Your directory structure could look like

profiles/mz/modules/contrib
profiles/mz/modules/custom
profiles/mz/modules/features
profiles/mz/libraries
profiles/mz/themes
...
sites/default/settings.php
sites/default/settings.prod.php
sites/default/settings.test.php
sites/default/settings.dev.php
...
index.php
...
README.md

We also store settings.php in the git repository, and include an if statement that loads the right settings file (settings.prod.php, settings.test.php or settings.dev.php) depending on the environment the site is running in.
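As a rough illustration of such an if statement, the fragment below picks the per-environment file based on a hypothetical APP_ENV environment variable. The variable name, the list of environments and the fallback to dev are assumptions made for this sketch; the post does not say how the environment is detected, so adapt it to whatever your hosting exposes.

```php
<?php
// Fragment for sites/default/settings.php (Drupal 7). A sketch only.
// Assumption: each server exports a hypothetical APP_ENV variable set to
// "prod", "test" or "dev". This name is illustrative, not a Drupal convention.
$environment = getenv('APP_ENV');

// Accept known values only, so an unexpected value cannot select an
// arbitrary file. Falling back to "dev" is a choice made for this sketch.
if (!in_array($environment, array('prod', 'test', 'dev'), TRUE)) {
  $environment = 'dev';
}

// Resolves to settings.prod.php, settings.test.php or settings.dev.php in
// the same directory as this settings.php.
$environment_settings = dirname(__FILE__) . '/settings.' . $environment . '.php';
if (file_exists($environment_settings)) {
  include_once $environment_settings;
}
```

Placing this near the end of settings.php lets the per-environment file override anything set earlier in the shared file.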

Another advantage of versioning environment-dependent settings is that you can force certain variables to be set in code, e.g. for production you might want to add

// Caching settings
$conf['page_cache_without_database'] = TRUE;
$conf['page_cache_invoke_hooks'] = FALSE;

to your settings.prod.php, making all these settings directly available in git and thus open to review by your peers. It also saves you from wondering which setting is active in which environment.

Finally, create a settings.local.php file that is loaded from settings.php, with your local database settings.

// For local development
if (file_exists('./sites/default/settings.local.php')) {
  include_once('./sites/default/settings.local.php');
}

What’s next?

Now that we have given an overview of our favourite project architecture in Drupal (makefiles, profiles, and GitHub), some of the next topics we’d like to talk about are setting up a continuous integration pipeline (using Travis CI), writing a couple of Behat tests to assert your site works fine, and reviewing our workflows with GitHub (pull requests, issues, releases).

Did you miss our “Coding as a Team” series? Check out pt.1: automation workflow using Phing, pt. 2: using content fixtures and pt. 3: code workflow.

Featured image credit: brianbutko / Flickr

Categories: Elsewhere

Steve McIntyre: Bootstrapping arm64 in Debian

Planet Debian - Tue, 06/01/2015 - 18:03

I promised to write about this a long time ago, ooops... :-)

Another ARM port in Debian - yay!

arm64 is officially a release architecture for Jessie, aka Debian version 8. That's taken a lot of manual porting and development effort over the last couple of years, and it's also taken a lot of CPU time - there are ~21,000 source packages in Debian Jessie! As is often the case for a brand new architecture like arm64 (or AArch64, to use ARM's own terminology), hardware can be really difficult to get hold of. In time this will cease to be an issue as hardware becomes more commoditised, but in Debian we really struggled to get hold of equipment for a very long time during the early part of the port.

First bring-up in Debian Ports

To start with, we could use ARM's own AArch64 software models to build the first few packages. This worked, but only very slowly. Then Chen Baozi and the folks running the Tianhe-2 supercomputer project in Guangzhou, China contacted us to offer access to some arm64 hardware, and this is what Wookey used for bootstrapping the new port in the unofficial Debian Ports archive. This has now become the normal way for new architectures to get into Debian. We got most of the archive built in debian-ports this way, and we could then use those results to seed the initial core set of packages in the main Debian archive.

Second bring-up - moving into the main Debian archive

By the time that first Debian bring-up was done, ARM was starting to produce its own "Juno" development boards, and with the help of my boss^4 James McNiven we managed to acquire a couple of those machines for use as official Debian build machines. The existing machines in China were faster, but for various reasons quite difficult to maintain as official Debian machines. So I set up the Junos as buildds just before going to DebConf in August 2014. They ran very well, and (for dev boards!) were very fast and stable. They built a large chunk of the Debian archive, but as the release freeze for Jessie grew close we weren't quite there. There was a small but persistent backlog of un-built packages that were causing us issues, plus the Juno machines are/were not quite suitable as porter boxes for Debian developers all over the world to use for debugging their packages on the new architecture.

More horsepower - Linaro machines

This is where Linaro came to our aid. Linaro's goal is to help improve Free and Open Source Software on ARM, and one of the more recent projects in Linaro is a cluster of servers that are made available for software developers to use to get early access to ARMv8 (arm64) hardware. It's a great way for people who are interested in this new architecture to try things out, port their software or indeed just help with the general porting effort.

As Debian is seen as such an important part of the FLOSS ecosystem, we managed to negotiate dedicated access to three of the machines in that cluster for Debian's use and we set those up in October, shortly before the freeze for Jessie. Andy Doan spent a lot of his time getting these machines going for us, and then I set up two of them as build machines and one as the porter box we were still needing.

With these extra machines available, we quickly caught up with the ever-busy "Needs-Build" queue and we've got sufficient build power now to keep things going for the Jessie release. We were officially added to the list of release architectures at the Cambridge mini-Debconf in November, and all is looking good now!

And in the future?

I've organised the loan of another arm64 machine from AMD for Debian to use for further porting and/or building. We're also expecting that more and more machines will be coming out soon as vendors move on from prototyping to producing real customer equipment. Once that's happened, more kit will be available and everybody will be able to have arm64-powered computers in the server room, on their desk and even inside their laptop! Mine will be running Debian Jessie... :-)


There's been a lot of people involved in the Debian arm64 bootstrapping at various stages, so many that I couldn't possibly credit them all! I'll highlight some, though. :-)

First of all, Wookey's life has revolved around this port for the last few years, tirelessly porting, fixing and hacking out package builds to get us going. We've had loads of help from other teams in Debian, particularly the massive patience of the DSA folks with getting early machines up and running and the prodding of the ftpmaster, buildd and release teams when we've been grinding our way through ever more package builds and dependency loops. We've also had really good support from toolchain folks in Debian and ARM, fixing bugs as we've found them by stressing new code and new machines. We've had a number of other people helping by filing bugs and posting patches to help us get things built and working. And (last but not least!) thanks to all the folks who've helped us beg and borrow the hardware to make the Debian arm64 port a reality.

Rumours of even more ARM ports coming soon are entirely scurrilous... *grin*

Categories: Elsewhere

Drupal core announcements: This Month in Drupal Documentation - December 2014

Planet Drupal - Tue, 06/01/2015 - 17:46

Here's an update from the Documentation Working Group (DocWG) on what has been happening in Drupal Documentation in the last month or so. Sorry... because this is posted in the Core group as well as Documentation, comments are disabled.

If you have comments or suggestions, please see the DocWG home page for how to contact us. Thanks!

Thanks for contributing!

Since November 28 (our previous TMIDD post), 232 contributors have made 733 total Drupal.org documentation page revisions, including 8 people who made more than 20 edits (lolandese, Francewhoa, webchick, kreynen, YesCT, Pierre.Vriens, Wim Leers, and PsyCode) -- thanks everyone! Most of these people are seasoned Drupal contributors, but PsyCode is a Google Code-In participant who has been editing pages. That's great -- we love it when our old friends come back to help, and when new people get involved!

In addition, there were many many commits to Drupal Core and contributed projects that improved documentation -- these are hard to count, because many commits combine code and documentation -- but they are greatly appreciated too!

Documentation Priorities

The Current documentation priorities page is always a good place to look to figure out what to work on, and has been updated recently.

Of special note: We're trying to get the Help pages inside Drupal 8 updated -- check the priorities page and events section below on this page for details.

If you're new to contributing to documentation, these projects may seem a bit overwhelming -- so why not try out a New contributor task to get started?

Upcoming Events

Report from the Working Group

The Working Group meets monthly via Google Hangouts, and our next meeting is January 14. Contact Boris if you'd like to join the meeting. We're currently discussing how we and the community can address several big problems:

  • The lack of documentation for Drupal in languages other than English.
  • The difficulty of locating documentation that answers specific Drupal questions.
  • The overwhelming size of Drupal documentation for newcomers (not knowing where to start).
Categories: Elsewhere

Holger Levsen: 20150106-lts-december-2014

Planet Debian - Tue, 06/01/2015 - 17:34
My LTS December

In December 2014 I spent 11h on Debian LTS work and managed to get six DLAs released and another one almost done... I did:

  • Release DLA 103-1 which was previously prepared by Ben, Raphael and myself. So while for this release in December I only had to review one patch, I also had to build the package, provide preliminary .debs, ask for feedback, do some final smoke tests, write the announcement and do the upload. In total this still took 2.5h to "just release it"...
  • Doing DLA 114-1 for bind9 was rather straightforward,
  • As was DLA 116-1 for ntp, which I managed to release within one hour after the DSA for wheezy, despite me having to make the patch apply cleanly due to some openssl differences...
  • I mentioned the bit about openssl because no one ever made a mistake with such patches. Seriously, I mean: I would welcome a public review system for security fixes. We are all humans and we all make mistakes. I do think my ntp patching was safe, but... mistakes happen.
  • DLA 118-1 was basically "just" a new kernel update, which I almost released on my own, until (thankfully) Ben helped me with one patch from .65 not applying (a fix for a wrong fix which Debian had already fixed correctly), which was due to a patch not being removed correctly because of line number changes. And while I was still wrapping my head around applying and de-applying these very similar looking patches, Ben had already committed the fix. I'm quite happy with this sharing of the work, due to the following benefits: a.) Ben can spend more time on important tasks and b.) LTS users get more kernel security fixes faster.
  • DLA 119-1 for subversion was a rather straightforward take from the wheezy DSAs again; I just had to make sure to also include the 2nd regression-fixing DSA.
  • And then, I failed to finish my work on a jqueryui update before 31c3 started. And 31c3 really only ended yesterday when I helped putting stuff on trucks and cleaned the big hall... So that's also why I'm only writing this blog post now, and not two weeks ago, as I probably should have. Anyway, according to the security-tracker jqueryui is affected by two CVEs and that's wrong: CVE-2012-6662 does not affect the squeeze version. CVE-2010-5312 on the other hand affects the squeeze version, I know how to fix it, I just lacked a quiet moment to prepare my fix properly and test it, and so I've rather postponed doing so during 31c3... so, expect a DLA for jqueryui very soon now!

Thanks to everyone who is supporting Squeeze LTS in whatever form! Even just expressing that you or the company or project you're working with is using LTS is useful, as it's always nice to hear that one's work is used and appreciated. If you can contribute more, please do so. If you can't, that's also fine. It's free software after all.

Categories: Elsewhere

Blink Reaction: Getting up and VMing

Planet Drupal - Tue, 06/01/2015 - 17:32

Last time we overviewed virtualization and its advantages for Drupal developing. It was all theory stuff, so I’m sure you’re itching to actually get a VM going. Let’s get started.

Categories: Elsewhere

Tiago Bortoletto Vaz: A few excerpts from The Invisible Committee's latest article

Planet Debian - Tue, 06/01/2015 - 16:29

Just sharing some points from "2. War against all things smart!" and "4. Techniques against Technology" in The Invisible Committee's "Fuck off Google" article. You may want to get the "Fuck off Google" pdf and watch that recent talk at 31C3.

"...predicts The New Digital Age, “there will be people who resist adopting and using technology, people who want nothing to do with virtual profiles, online data systems or smart phones. Yet a government might suspect that people who opt out completely have something to hide and thus are more likely to break laws, and as a counterterrorism measure, that government will build the kind of ‘hidden people’ registry we described earlier. If you don’t have any registered social-networking profiles or mobile subscriptions, and on-line references to you are unusually hard to find, you might be considered a candidate for such a registry. You might also be subjected to a strict set of new regulations that includes rigorous airport screening or even travel restrictions.”"

I was introduced to the following observations about 5 years ago when reading "The Immaterial" by André Gorz. Now The Invisible Committee makes them even clearer in very few words:

"Technophilia and technophobia form a diabolical pair joined together by a central untruth: that such a thing as the technical exists. [...] Techniques can’t be reduced to a collection of equivalent instruments any one of which Man, that generic being, could take up and use without his essence being affected."

"[...] In this sense capitalism is essentially technological; it is the profitable organization of the most productive techniques into a system. Its cardinal figure is not the economist but the engineer. The engineer is the specialist in techniques and thus the chief expropriator of them, one who doesn’t let himself be affected by any of them, and spreads his own absence from the world every where he can. He’s a sad and servile figure. The solidarity between capitalism and socialism is confirmed there: in the cult of the engineer. It was engineers who drew up most of the models of the neoclassical economy like pieces of contemporary trading software."

Categories: Elsewhere

Gábor Hojtsy: 2014 in review from a multilingual Drupal perspective

Planet Drupal - Tue, 06/01/2015 - 15:30

Whew! 2014 was a fantastic year for the Drupal multilingual team. We had some great events with huge sprints, including but not limited to: Global Sprint Weekend, the amazing Drupal Dev Days Europe, NYC Camp, DrupalCon Austin, DrupalCon Amsterdam and BADCamp.

A fun fact about people on the multilingual team is that even though we usually turn out in big numbers at sprints, there are numerous great mentors among us, so we don't work on code that much at mentored sprints. We do a great job helping people get started and move into more serious core work though. Our most famous mentee this past year is 2014th Drupal 8 core contributor Holly Ross, Executive Director of the Drupal Association who contributed her first and second core patches fixing multilingual issues.

Categories: Elsewhere

Bálint Réczey: Kodi from Debian

Planet Debian - Tue, 06/01/2015 - 13:13

The well-known XBMC Media Center has been renamed to Kodi with the 14.0 Helix release and, following upstream’s decision, the xbmc packages are renamed to kodi as well. Debian ships a slightly changed version of XBMC using the “XBMC from Debian” name and, following that tradition, ladies and gentlemen, let me introduce to you “Kodi from Debian”:

Kodi from Debian main screen

As of today Kodi from Debian uses the FFmpeg packages instead of the Libav ones which have been used by XBMC from Debian. The reasons for the switch were upstream’s decision to drop the Libav compatibility code and FFmpeg becoming available again as a Debian package (thanks to Andreas Cadhalpun). It is worth noting that while upstream Kodi 14.0 downloads and builds FFmpeg 2.4.4 by default, Debian ships FFmpeg 2.5.1 already and FFmpeg under Kodi will be updated independently from Kodi thanks to the packaging mechanism.

The new kodi packages have been uploaded to the NEW queue and are waiting to be accepted by the FTP Masters, who are busy preparing Jessie for the release (many thanks to them for their hard work!), but in the meantime you can install kodi from https://people.debian.org/~rbalint/ppa/xbmc-ffmpeg/.

Happy recovery from the holidays!

Categories: Elsewhere

