Dirk Eddelbuettel: RcppAnnoy 0.0.7

Planet Debian - Tue, 17/11/2015 - 03:54

A new version of RcppAnnoy, our Rcpp-based R integration of the nifty Annoy library by Erik, is now on CRAN. Annoy is a small, fast, and lightweight C++ template header library for approximate nearest neighbours.

This release mostly just catches up with the Annoy release 1.6.2 of last Friday. No new features were added on our side.

Courtesy of CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: Elsewhere

Four Kitchens: Austin's Drupal 8 Launch Party Here we come!

Planet Drupal - Tue, 17/11/2015 - 03:36

Join us and the rest of the Austin community for a well-deserved par-tay! We have quite the party planned, including BBQ, a cake, a piñata and even a raffle.

Categories: Elsewhere

Vincent Fourmond: Purely shell way to extract a numbered line from a file

Planet Debian - Mon, 16/11/2015 - 22:31
I feel almost ashamed to write it down, but as it took me a long time to realize this, I'll write it down anyway. Here's the simplest portable shell one-liner I've found to extract only, say, the 5th line from a file:

~ cat file | tail -n +5 | head -n1
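For comparison, the same extraction can be sketched in Python. This is only an illustrative equivalent and not part of the original tip; the helper name nth_line is made up for this example.

```python
from itertools import islice


def nth_line(path, n):
    """Return line number n (1-based) of the file at path, or None if the file is shorter."""
    with open(path, encoding="utf-8") as handle:
        # islice skips the first n - 1 lines lazily, so the whole file is never held in memory.
        return next(islice(handle, n - 1, n), None)
```

Like the pipeline above, this stops reading once the requested line has been produced.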

Hope it helps...

Categories: Elsewhere

Drupal @ Penn State: Cache warming authenticated sites with XMLRPC

Planet Drupal - Mon, 16/11/2015 - 21:56

This video talks through how XMLRPC Page Load and HTTPRL Spider can be used to warm caches on private / authenticated sites. XMLRPC Page Load provides a callback that tricks Drupal into thinking that it’s delivering a page to a certain user account. It does this by simulating page delivery but then never actually writing the output anywhere.

Categories: Elsewhere

Drupal @ Penn State: The future of Drupal under the hood

Planet Drupal - Mon, 16/11/2015 - 21:56

I’m in the middle of several Drupal Camps / Cons (any event over 1000 people is no longer a “Camp” but that’s for another time) and it’s occurred to me: I can no longer learn by going. By that I mean learning in the traditional sense, which is what I used to go to Camps for (I’ve been coming to camps for 8 years now).

Categories: Elsewhere

Chromatic: TheaterMania: Lessons Learned on Localization

Planet Drupal - Mon, 16/11/2015 - 21:28

We recently launched a new site for an existing client, TheaterMania. We helped launch and currently maintain and develop The Gold Club, which is a subscription-based discount theater club in New York City. The new site is the same thing, but in London – same language, same codebase, new database, different servers. We only had to migrate users, which were already exported for us, so nothing exceptional there. Shouldn’t be a big deal, right? We learned that’s not always the case.

Architectural Decisions

One of our first problems, besides the obvious localization issues (currency, date formats, language), was to decide what we were shipping. Were we just building another site? Were we packaging software? There will most likely be more sites in other cities in the future – how far did we want to go in terms of making this a product that we could ship? In the end, we wound up going somewhere in the middle. We had to decide initially if we would use Organic Groups to have one site with multiple “clubs,” one Drupal multisite installation, or multiple Drupal installations. The final decision was to combine the latter two choices – we created multisite-style directories so that if we need to take the site in a multi-site direction, we can easily do that. The sites each have a site-specific settings file, full of various configuration variables.

Now that the site has been launched, we’re not sure if this list of variables will be developer-friendly moving forward, and have been keeping in mind that we may want a more elegant solution for this. The best part about this setup is that we have one codebase, one master branch, and each site is configured to use the appropriate settings. The most important thing is that this is all very thoroughly documented in the code, the README files, and the repo wiki.

Currency & Recurly: Easier than Expected

One of the issues I thought would be very problematic was currency, but that wasn’t actually an issue. All of the existing transactions are set up in cents (i.e., 100 instead of 1.00 for a dollar), and that translates perfectly from dollars to pounds. We use Recurly, an external payment and subscription processor, so we didn’t have to worry about any localization issues on that front. Most of the currency abstractions I did were to remove any hard-coded references to the dollar sign, and create functions and variables to get the appropriate currency symbol.
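As a rough illustration of that abstraction, here is a minimal Python sketch (the site itself is PHP/Drupal). The names CURRENCY_SYMBOLS and format_price are hypothetical and do not come from the project; the sketch assumes non-negative integer amounts stored in minor units, as described above.

```python
# Hypothetical per-site lookup; the real project keeps such values in Drupal settings files.
CURRENCY_SYMBOLS = {"USD": "$", "GBP": "£"}


def format_price(amount_in_minor_units, currency_code):
    """Format a non-negative integer amount (cents or pence) with the site's currency symbol."""
    symbol = CURRENCY_SYMBOLS[currency_code]
    # Integer arithmetic only: 1999 becomes 19 major units and 99 minor units.
    major, minor = divmod(amount_in_minor_units, 100)
    return f"{symbol}{major}.{minor:02d}"
```

Because the stored value never changes, only the symbol lookup differs between the US and UK sites.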

Dealing with Dates; Ugh.

Date formats were something I expected to be easy, but that wound up being more complex. I discovered hook_date_combo_process_alter() to change the display of the date in calendar popup fields. This made what I’d thought was going to be a difficult series of view handlers really simple. We have several fields using the date combo box on both content types and entities, and this function took care of them.

/**
 * Implements hook_date_combo_process_alter().
 *
 * Changes the date format.
 */
function gc_display_date_combo_process_alter(&$element, &$form_state, $context) {
  if (isset($element['#entity']->type)) {
    switch ($element['#entity']->type) {
      case 'event':
        $element['value']['#date_format'] = variable_get('date_format_short');
        break;

      case 'partner':
        $element['value']['#date_format'] = variable_get('date_format_short');
        $element['value2']['#date_format'] = variable_get('date_format_short');
        break;

      case 'promo_offer':
        $element['value']['#date_format'] = variable_get('date_format_short');
        $element['value2']['#date_format'] = variable_get('date_format_short');
        break;
    }
  }
  elseif (isset($element['#entity']->field_name)) {
    // Only the date popup widget on the CSR notes field needs the short format.
    if ($element['value']['#instance']['widget']['type'] == 'date_popup' && $element['#entity']->field_name == 'field_user_csr_notes') {
      $element['value']['#date_format'] = variable_get('date_format_short');
    }
  }
}

I took the dozen or so existing date formats from Drupal, altered some of them to meet our needs, and added a few more. My head also started spinning when testing because I’m so used to M/D/Y formats that D/M/Y formats look really strange after a while, especially because code changes needed to be tested on the US and UK sites, so I had to be really careful when visually testing a page to make sure that a US page was showing 9/1/15 and the UK page was showing 1/9/15. In the future, I’d definitely advocate for a testing suite on a project like this. Overall, making sure all of the dates were changed was somewhat tedious, but not difficult. It required a lot of attention to detail and familiarity with PHP date formats, and vigorous testing by the whole team to make sure nothing had been missed.
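The kind of automated check mentioned above could start very small. The following Python sketch is an assumption about what such a check might look like, not the project's code; short_date and the "US"/"UK" keys are invented for illustration.

```python
from datetime import date


def short_date(d, site):
    """Render a date in the short numeric style of the given site ("US" or "UK")."""
    two_digit_year = f"{d.year % 100:02d}"
    if site == "US":
        # Month first: 1 September 2015 becomes 9/1/15.
        return f"{d.month}/{d.day}/{two_digit_year}"
    if site == "UK":
        # Day first: 1 September 2015 becomes 1/9/15.
        return f"{d.day}/{d.month}/{two_digit_year}"
    raise ValueError(f"unknown site: {site}")
```

Picking a test date whose day and month differ, such as 1 September, makes a swapped format show up immediately.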

Proper Use of t() Early == Wins Later

This project made me extremely grateful for the t() function. Since both sites were in English, we didn’t have a need for site-wide translation, but we did need to localize a handful of strings, both for language issues (words like ‘personalize’ vs ‘personalise’), and the general language preference of the stakeholders. It was easy enough to find the strings and list them in locale_custom_strings_en to switch them out. One gotcha we came across that I wasn’t familiar with – you cannot use t() in your settings files. The function isn’t available at that point in the bootstrapping. You can use get_t(), but we opted to remove the translation strings from any variables and make sure that t() was used when the variable was called. This wasn’t something I had run into before, and it caused some problems before we figured it out.
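The approach we settled on, keeping settings values untranslated and translating only when they are read, can be sketched in Python. This is a loose analogy rather than Drupal code: translate() stands in for t(), and CUSTOM_STRINGS_EN, SETTINGS and label() are hypothetical names.

```python
# Hypothetical override table, playing the role of locale_custom_strings_en.
CUSTOM_STRINGS_EN = {"Personalize your account": "Personalise your account"}

# Settings hold the untranslated source string, because the translation
# function is not available yet when settings are loaded.
SETTINGS = {"account_heading": "Personalize your account"}


def translate(text, overrides=CUSTOM_STRINGS_EN):
    """Return the site-specific replacement for text, or text itself if none exists."""
    return overrides.get(text, text)


def label(name):
    """Look up a settings value and translate it at the moment it is used."""
    return translate(SETTINGS[name])
```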


A few tricky miscellaneous problems cropped up, too. There was a geolocation function enabled in Recurly, which was defaulting to the US and we were unable to change the settings – we also didn’t realize this when testing in the US, and we scratched our heads when the London team told us the field was defaulting to US until we came across the culprit. We were able to fix it, and put in a patch for the library causing the issue.

I also realized how many various settings default to the US when working on this project – a lot of the location-related work was just abstracting out country defaults. Something to keep in mind if you’re working on a project with locations. Don’t make more work for developers who live or work on projects outside of the US. Plan for the future! Assume nothing!

Looking Back

I’m really glad that I worked on this project, because it’s made me develop with a better eye for abstraction of all kinds, and making sure that it’s easy for developers or users to work with my code anywhere. In the future, I’d put more thought into managing our configurations from the start, as well as automating the testing process, both for time-saving and better QA.

If you’ve ever worked on a site with challenges like these, I’d love to hear how you handled them! What are your best practices for managing custom locale strings and other site-specific variables? To what extent do you abstract things like dates and currency when developing a site, even when you don’t know if those will ever change?

Categories: Elsewhere

Steve Kemp: lumail2 nears another release

Planet Debian - Mon, 16/11/2015 - 21:15

I'm pleased with the way that Lumail2 development is proceeding, and it is reaching a point where there will be a second source-release.

I've made a lot of changes to the repository recently, and most of them boil down to moving code from the C++ side of the application, over to the Lua side.

This morning, for example, I updated the handling of index.limit to be entirely Lua based.

When you open a Maildir folder you see the list of messages it contains, as you would expect.

The notion of the index.limit is that you can limit the messages displayed, for example:

  • See all messages: Config:set( "index.limit", "all")
  • See only new/unread messages: Config:set( "index.limit", "new")
  • See only messages which arrived today: Config:set( "index.limit", "today")
  • See only messages which contain "Steve" in their formatted version: Config:set( "index.limit", "steve")

These are just examples that are present as defaults, but they give an idea of how things can work. I guess it isn't so different to Mutt's "limit" facilities - but thanks to the dynamic Lua nature of the application you can add your own with relative ease.
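The idea of named limits plus a free-text fallback can be sketched as follows. This is Python rather than Lumail2's Lua and does not reflect its real API; the message dictionaries, LIMITS and limit_messages are assumptions, as is the case-insensitive matching for free-text limits.

```python
from datetime import date

# Each message is a plain dict: {"unread": bool, "received": date, "text": str}.
LIMITS = {
    "all": lambda message, today: True,
    "new": lambda message, today: message["unread"],
    "today": lambda message, today: message["received"] == today,
}


def limit_messages(messages, limit, today):
    """Return the messages that match the named limit or, failing that, a text search."""
    predicate = LIMITS.get(limit)
    if predicate is None:
        needle = limit.lower()

        def predicate(message, today):
            # Any other value is treated as a case-insensitive substring match.
            return needle in message["text"].lower()

    return [message for message in messages if predicate(message, today)]
```

Adding a new limit is then a matter of registering one more entry in the table.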

One of the biggest changes, recently, was the ability to display coloured text! That was always possible before, but a single line could only be one colour. Now colours can be mixed within a line, so this works as you might imagine:

Panel:append( "$[RED]This is red, $[GREEN]green, $[WHITE]white, and $[CYAN]cyan!" )
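One way such markup could be split into coloured segments is sketched below in Python. It is not Lumail2's implementation; split_colours is a made-up name, and starting in WHITE before the first tag is an assumption.

```python
import re

# Matches tags such as $[RED] and captures the colour name.
_COLOUR_TAG = re.compile(r"\$\[([A-Z]+)\]")


def split_colours(text, default="WHITE"):
    """Split text into (colour, chunk) pairs, switching colour at each $[NAME] tag."""
    segments = []
    colour = default
    position = 0
    for match in _COLOUR_TAG.finditer(text):
        if match.start() > position:
            # Text before this tag belongs to the colour that was active so far.
            segments.append((colour, text[position:match.start()]))
        colour = match.group(1)
        position = match.end()
    if position < len(text):
        segments.append((colour, text[position:]))
    return segments
```

A renderer can then draw each chunk in turn with its own colour.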

Other changes include a persistent cache of "stuff", which is Lua-based, the inclusion of at least one luarocks library to parse Date: headers, and a simple API for all our objects.

All good stuff. Perhaps time for a break in the next few weeks, but right now I think I'm making useful updates every other evening or so.

Categories: Elsewhere

Acquia Developer Center Blog: Open Sourcing Statsgod, a StatsD Implementation In Go

Planet Drupal - Mon, 16/11/2015 - 21:10
Kevin Hankens

Acquia Engineering is excited to be open-sourcing Statsgod, a reimplementation of StatsD we created internally to help scale our metrics collection effort.

Acquia developers often create tooling to build, deploy, and monitor applications we run on Amazon Web Services, and Statsgod is one such tool that we want to make publicly available. Statsgod was designed to be highly scalable and easily deployed.

Categories: Elsewhere

DrupalOnWindows: Exposing reverse entity reference fields in Drupal

Planet Drupal - Mon, 16/11/2015 - 20:55

Entity references in Drupal are the mechanism used to do some "proper" data modeling (sorry for the quotes, but what you can achieve with Drupal is years behind a real ORM such as the Entity Framework in terms of usability, reliability, flexibility and overall quality) without having to write everything from scratch, including queries, widgets and storage.

Categories: Elsewhere

Daniel Pocock: Quick start using Blender for video editing

Planet Debian - Mon, 16/11/2015 - 19:53

Updated 2015-11-16 for WebM

Although it is mostly known for animation, Blender includes a non-linear video editing system that is available in all the current stable versions of Debian, Ubuntu and Fedora.

Here are some screenshots showing how to start editing a video of a talk from a conference.

In this case, there are two input files:

  • A video file from a DSLR camera, including an audio stream from a microphone on the camera
  • A separate audio file with sound captured by a lapel microphone attached to the speaker's smartphone. This is a much better quality sound and we would like this to replace the sound included in the video file.

Open Blender and choose the video editing mode

Launch Blender and choose the video sequence editor from the pull down menu at the top of the window:

Now you should see all the video sequence editor controls:

Set up the properties for your project

Click the context menu under the strip editor panel and change the panel to a Properties panel:

The video file we are playing with is 720p, so it seems reasonable to use 720p for the output too. Change that here:

The input file is 25fps so we need to use exactly the same frame rate for the output, otherwise you will either observe the video going at the wrong speed or there will be a conversion that is CPU intensive and degrades the quality. Also check that the resolution_percentage setting under the picture dimensions is 100%:

Now specify output to PNG files. Later we will combine them into a WebM file with a script. Specify the directory where the files will be placed and use the # placeholder to specify the number of digits to use to embed the frame number in the filename:

Now your basic rendering properties are set. When you want to generate the output file, come back to this panel and use the Animation button at the top.

Editing the video

Use the context menu to change the properties panel back to the strip view panel:

Add the video file:

and then right click the video strip (the lower strip) to highlight it and then add a transform strip:

Audio waveform

Right click the audio strip to highlight it and then go to the properties on the right hand side and click to show the waveform:

Rendering length

By default, Blender assumes you want to render 250 frames of output. Looking in the properties to the right of the audio or video strip you can see the actual number of frames. Put that value in the box at the bottom of the window where it says 250:

Enable AV-sync

Also at the bottom of the window is a control to enable AV-sync. If your audio and video are not in sync when you preview, you need to set this AV-sync option and also make sure you set the frame rate correctly in the properties:

Add the other sound strip

Now add the other sound file that was recorded using the lapel microphone:

Enable the waveform display for that sound strip too; this will allow you to align the sound strips precisely:

You will need to listen to the strips to estimate the time difference. Use this estimate to set the "start frame" in the properties for your audio strip; it will be a negative value if the audio strip starts before the video. You can then zoom the strip panel to show about 3 to 5 seconds of sound and try to align the peaks. An easy way to do this is to look for applause at the end of the audio strips: the applause generates a large peak that is easily visible.
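Converting the estimated time difference into a start frame is simple arithmetic, sketched here in Python. The function name start_frame is invented for this example, and the default of 25 fps matches the project settings chosen earlier.

```python
def start_frame(offset_seconds, frame_rate=25):
    """Convert an audio offset in seconds into a whole number of frames.

    offset_seconds is negative when the audio strip starts before the video.
    """
    # Frames are whole numbers, so round to the nearest frame.
    return round(offset_seconds * frame_rate)
```

For example, audio that starts 3.2 seconds before the video at 25 fps gives a start frame of -80; the waveform peaks are then used for the final fine adjustment.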

Once you have synced the audio, you can play the track and you should not be able to hear any echo. You can then silence the audio track from the camera by right clicking it, looking in the properties to the right and changing the volume to 0.

Make any transforms you require

For example, to zoom in on the speaker, right click the transform strip (3rd from the bottom) and then in the panel on the right, click to enable "Uniform Scale" and then set the scale factor as required:

Render the video output to PNG

Click the context menu under the Curves panel and choose Properties again.

Click the Animation button to generate a sequence of PNG files for each frame.

Render the audio output

On the Properties panel, click the Audio button near the top. Choose a filename for the generated audio file.

Look on the bottom left-hand side of the window for the audio file settings and change them to the Ogg container and Vorbis codec:

Ensure the filename has a .ogg extension.

Now look at the top right-hand corner of the window for the Mixdown button. Click it and wait for Blender to generate the audio file.

Combine the PNG files and audio file into a WebM video file

You will need to have a few command line tools installed for manipulating the files from scripts. Install them using the package manager, for example, on a Debian or Ubuntu system:

# apt-get install mjpegtools vpx-tools mkvtoolnix

Now create a script like the following:

#!/bin/bash -e

# Set this to match the project properties
FRAME_RATE=25

# Set this to the rate you desire:
TARGET_BITRATE=1000

WORK_DIR=${HOME}/video1
PNG_DIR=${WORK_DIR}/frames
YUV_FILE=${WORK_DIR}/video.yuv
WEBM_FILE=${WORK_DIR}/video.webm
AUDIO_FILE=${WORK_DIR}/audio-mixed.ogg

NUM_FRAMES=`find ${PNG_DIR} -type f | wc -l`

png2yuv -I p -f $FRAME_RATE -b 1 -n $NUM_FRAMES \
    -j ${PNG_DIR}/%08d.png > ${YUV_FILE}

vpxenc --good --cpu-used=0 --auto-alt-ref=1 \
    --lag-in-frames=16 --end-usage=vbr --passes=2 \
    --threads=2 --target-bitrate=${TARGET_BITRATE} \
    -o ${WEBM_FILE}-noaudio ${YUV_FILE}

rm ${YUV_FILE}

mkvmerge -o ${WEBM_FILE} -w ${WEBM_FILE}-noaudio ${AUDIO_FILE}

rm ${WEBM_FILE}-noaudio

Next steps

There are plenty of more comprehensive tutorials, including some videos on YouTube, explaining how to do more advanced things like fading in and out or zooming and panning dynamically at different points in the video.

If the lighting is not good (faces too dark, for example), you can right click the video strip, go to the properties panel on the right hand side and click Modifiers, Add Strip Modifier and then select "Color Balance". Use the Lift, Gamma and Gain sliders to adjust the shadows, midtones and highlights respectively.

Categories: Elsewhere

Pantheon Blog: Better Behavior-Driven Development on Remote Servers

Planet Drupal - Mon, 16/11/2015 - 18:48
Behavior-Driven Development is a widely used testing methodology in which functional tests (that is, tests that operate on the whole of a system) are described in a natural, readable language called Gherkin. The goal of this methodology is to make the contents of the tests approachable to non-technical stakeholders. This makes it possible for a project’s functional tests to be meaningfully used as the acceptance criteria for the product.
Categories: Elsewhere

Red Route: How to add classes to links in Drupal 8

Planet Drupal - Mon, 16/11/2015 - 17:48

As I start porting the modules I maintain to Drupal 8, I'm hitting a few places where things haven't been intuitive to me. I'll try to work on the drupal.org documentation when I get a chance, but in the meantime I figured it would be worth writing up a few notes.

A common task is creating a link, and adding classes and other attributes to it. The Responsive Share Buttons module is basically just a block of links to social networks, so this was a key building block.

In Drupal 7 this was pretty simple - the link building function took three arguments - a title, a path, and an array of options:

$link = l(t('Link Title'), 'http://drupal.org', array(
  'attributes' => array(
    'class' => array('example-class'), // Hypothetical class name; the original value is missing here.
  ),
));

In Drupal 8, the l function now takes a Url object with attributes, rather than a string, so it's a little different. Here's how to build a link to an external URL and add a class to it. First, the Url class needs to be brought into scope:

use Drupal\Core\Url;

And then you can build the Url object and call setOptions on it:

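$url = Url::fromUri('http://drupal.org');
$link_options = array(
  'attributes' => array(
    'class' => array('example-class'), // Hypothetical class name; the original value is missing here.
  ),
);
$url->setOptions($link_options);
$link = \Drupal::l(t('Link title'), $url);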

Incidentally, the other gotcha here that had me scratching my head for a while was how to get the current page title, and how to get the current URL. Drupal 7 had easily accessible functions for these tasks, but the object-oriented approach

Drupal 7

$title = drupal_get_title();
$current_url = url(current_path(), array('absolute' => TRUE));

Drupal 8

$request = \Drupal::request();
$route_match = \Drupal::routeMatch();
$title = \Drupal::service('title_resolver')->getTitle($request, $route_match->getRouteObject());
$current_url = $request->getUri();

My own learning journey with Drupal 8 is very much in its early days, and a lot of the old Drupalisms are pretty familiar to me, but it does seem a little long-winded. Give it a while, and I'm sure I'll get up to speed, and start seeing the benefits of the object-oriented approach in Drupal 8.

Categories: Elsewhere

Zivtech: oEmbed in Drupal: Embed all the things!

Planet Drupal - Mon, 16/11/2015 - 17:44

WordPress has great support for oEmbed, allowing content creators to paste in URLs that are automatically displayed as rich embedded content. You may also be familiar with similar behavior on Facebook and in chat services like Slack. Meanwhile in Drupal 7, most sites are using Media module with their WYSIWYG and are able to (with some effort) embed from certain providers. In Drupal 8, we finally have WYSIWYG in core, but no solution for adding videos and other embedded content. How can we have the ease of use of WordPress for embedding 3rd party content?

oEmbed Module

The Drupal oEmbed module, despite its humble project description, works nicely, and can easily give you an experience similar to the one you get in WordPress. I recommend signing up for an account with the Embed.ly service, and setting the cache lifetime high in the oEmbed module settings. This gives you access to a large number of oEmbed providers, and for many sites a high cache lifetime lets you stay within the free usage tier. The oEmbed module gives you the option of using an input filter to turn URLs in your textareas into embeds, or you can use the oEmbed Field submodule, which allows you to add link fields and use an oEmbed display formatter.

Asset Module: How are you so awesome and so overlooked?

The ability to use oEmbed in a field got me thinking about one of my favorite (and highly underrated) modules: Asset module. Asset module is essentially an alternative to the widely-used Media module (Scald is a third option in this space). While Media currently boasts 262,680 site installs, Asset is used by a humble 1,167. Drupalers are often advised that a good way to tell which module is the best when choosing between similar modules is to pick based on usage statistics and how much active development is occurring. Unfortunately, this is not foolproof advice: beware echo chambers.

If you've worked with Media module much (disclaimer: I was involved in early stages of Media module architecture and development), you're probably familiar with some of its flaws: an ever-changing variety of complex bugs on its 2.x branch, complicated relationship between Media and File Entity configurations, no straightforward method to add captions to images, multiple dialogs to click through just to add an image, bugs when you disable and re-enable rich text, and difficulty editing items after you add them to the WYSIWYG, to name a few.

Asset module in contrast has a lovely UI, provides common features out of the box (add an image to the WYSIWYG with working captions and right/left alignment), is simple to configure and use, relatively bug-free, and stable. It provides many of the same features as Media, like a library of reusable media assets you can add to a WYSIWYG or display in Views and the ability to add your own fielded bundles for various types of assets. In addition Asset module lets you pick your own WYSIWYG button icons and have a separate button in your WYSIWYG for each type of asset (image, video, document) and unlike Media module it is not directly tied to files. This means you can create Asset types for reusable, centrally-managed structured content that are not file-based at all. I like to make Asset types for things like Addresses and Calls to Action which authors can use within their WYSIWYG. You can quickly explore the wonders of Asset module on its demo site - make sure in addition to the WYSIWYG buttons you try out the 'Asset Widget' on the right side of the content creation page and see how you can drag existing assets into not only textareas but also entityreference fields.

oEmbed with Asset Module

What does this have to do with oEmbed? Well, guess what happens if you add a new Asset type with a link field you set to display as oEmbed? Yup, now you have an Embed button on your WYSIWYG that lets your authors paste in a URL from any of those services, or reuse embeds they've already added to their Asset library. No more adding separate modules to be able to integrate with YouTube, Vimeo, and more. In fact, now we have a better user experience than WordPress! The embeds even show up already rendered right in your WYSIWYG.

Here are some examples of embeds I can put into this WYSIWYG (content from myself and my old band from around the internet):

A song on Rdio

A video on YouTube

A tweet

.@tizzo at work pic.twitter.com/MMymGtT5mn

— Jody Hamilton (@JodyHamilton) June 28, 2015

A photo on Flickr

A LinkedIn user

A Github gist

A JibJab

Editor Experience

Want to see how it looks in my WYSIWYG? Let me embed a screenshot with my Asset image button!

My WYSIWYG right now... Like my caption?

The buttons on the right in my WYSIWYG are for Assets. I have a Document, Image, 'Call to Action', and then Embed Asset Types, followed by the Search button that lets me use my Asset library. By the way, the 'Call to Action' is just a link field that outputs like:

Get Asset Module!

When I press the Embed button, I embed an asset like

To add an embed, just paste in a URL. Note you pick your Asset button - here I'm using a heart because I heart this setup. You can also add your own icons (patch in the queue).

Please tune in for Part 2 of this series on how to set this up on your Drupal 7 site and Part 3: Embedding in Drupal 8.

Categories: Elsewhere

Drupal Camp NJ 2015: Announcing Mike Anello as the Keynote for DrupalCamp NJ 2016!

Planet Drupal - Mon, 16/11/2015 - 17:35

Mike Anello (@ultimike) is co-founder and vice president of DrupalEasy, a

Categories: Elsewhere

Pronovix: Retooling on Drupal 8: free training materials

Planet Drupal - Mon, 16/11/2015 - 16:55

We are working on a set of free training materials for Drupal 8. To make sure we build something that others will be able to reuse we would like to get your input on the kind of trainings you would like to use to retrain your team.

Categories: Elsewhere

Drupalpress, Drupal in the Health Sciences Library at UVA: Setting up Shibboleth + Ubuntu 14 + Drupal 7 on AWS with Virginia.edu integration

Planet Drupal - Mon, 16/11/2015 - 16:48

We’ve recently begun moving to Amazon Web Services for hosting; however, we still need to authenticate through ITS, which handles the central SSO authentication services for Virginia.edu. In previous posts we looked at Pubcookie, aka Netbadge. However, Pubcookie is getting pretty long in the tooth (its last release was back in 2010), and we are running Ubuntu 14 with Apache 2, so integrating Pubcookie was going to be a PITA. It was time to look at Shibboleth, an Internet2 SSO standard that works with SAML and is markedly more modern than Pubcookie, allowing federated logins between institutions and so on.

A special thanks to Steve Losen, who put up with way more banal questions than anyone should have to deal with. That said, he’s the man.

Anyhow, ITS does a fine job of documenting the basics: http://its.virginia.edu/netbadge/unixdevelopers.html. Since we’re using Ubuntu, the only real difference is that we used apt-get.

Here’s the entire install from base Ubuntu 14

apt-get install apache2 mysql-server php5 php-pear php5-mysql php5-ldap libapache2-mod-shib2 shibboleth-sp2-schemas drush sendmail ntp


Apache Set up

On the Apache2 side we enabled some modules and the default SSL site:

a2enmod ldap rewrite shib2 ssl
a2ensite default-ssl.conf

Still on the Apache2 side, here’s our default SSL site configuration:

<IfModule mod_ssl.c>
<VirtualHost _default_:443>
ServerAdmin webmaster@localhost
ServerName bioconnector.virginia.edu:443
DocumentRoot /some_web_directory/bioconnector.virginia.edu
<Directory /some_web_directory/dev.bioconnector.virginia.edu>
AllowOverride All
</Directory>

SSLEngine on

SSLCertificateFile /somewheresafe/biocon_hsl.crt
SSLCertificateKeyFile /somewheresafe/biocon_hsl.key

<Location />
AuthType shibboleth
# requireSession 0 means that creating a session is possible, not required
ShibRequestSetting requireSession 0
require shibboleth
</Location>
</VirtualHost>
</IfModule>

The Location settings are important: if you don’t have them in the Apache conf, you’ll need them in an .htaccess file in the Drupal directory space.

Shibboleth Config

The Shibboleth side confused me for a hot minute.

We used shib-keygen, as noted in the documentation, to create keys for Shibboleth, and ultimately the relevant part of our /etc/shibboleth/shibboleth2.xml looked like this:

<ApplicationDefaults entityID="https://www.bioconnector.virginia.edu/shibboleth"
REMOTE_USER="eppn uid persistent-id targeted-id">

<Sessions lifetime="28800" timeout="3600" relayState="ss:mem"
checkAddress="false" handlerSSL="true" cookieProps="https">
<!-- we went with SSL Required – so change handlerSSL to true and cookieProps to https -->

<SSO entityID="urn:mace:incommon:virginia.edu">
<!-- this is the production value, we started out with the testing config – ITS provides this in their documentation -->

<MetadataProvider type="XML" file="UVAmetadata.xml" />
<!-- Once things are working you should be able to find this at https://www.your-virginia-website.edu/Shibboleth/Metadata – it’s a file you download from ITS = RTFM -->
<AttributeExtractor type="XML" validate="true" reloadChanges="false" path="attribute-map.xml"/>
<!-- attribute-map.xml is the only other file you’re going to need to touch -->

<CredentialResolver type="File" key="sp-key.pem" certificate="sp-cert.pem"/>
<!-- these are the keys generated with shib-keygen -->
<Handler type="Session" Location="/Session" showAttributeValues="true"/>
<!-- During debug we used https://www.bioconnector.virginia.edu/Shibboleth.sso/Session with the showAttributeValues="true" setting on to see what was coming across from the UVa Shibboleth IdP -->

/etc/shibboleth/attribute-map.xml looked like this

<Attribute name="urn:mace:dir:attribute-def:eduPersonPrincipalName" id="eppn">
<AttributeDecoder xsi:type="ScopedAttributeDecoder"/>
</Attribute>

<Attribute name="urn:mace:dir:attribute-def:eduPersonScopedAffiliation" id="affiliation">
<AttributeDecoder xsi:type="ScopedAttributeDecoder" caseSensitive="false"/>
</Attribute>
<Attribute name="urn:oid:" id="affiliation">
<AttributeDecoder xsi:type="ScopedAttributeDecoder" caseSensitive="false"/>
</Attribute>

<Attribute name="urn:mace:dir:attribute-def:eduPersonAffiliation" id="unscoped-affiliation">
<AttributeDecoder xsi:type="StringAttributeDecoder" caseSensitive="false"/>
</Attribute>
<Attribute name="urn:oid:" id="unscoped-affiliation">
<AttributeDecoder xsi:type="StringAttributeDecoder" caseSensitive="false"/>
</Attribute>

<Attribute name="urn:mace:dir:attribute-def:eduPersonEntitlement" id="entitlement"/>
<Attribute name="urn:oid:" id="entitlement"/>

<Attribute name="urn:mace:dir:attribute-def:eduPersonTargetedID" id="targeted-id">
<AttributeDecoder xsi:type="ScopedAttributeDecoder"/>
</Attribute>

<Attribute name="urn:oid:" id="persistent-id">
<AttributeDecoder xsi:type="NameIDAttributeDecoder" formatter="$NameQualifier!$SPNameQualifier!$Name" defaultQualifiers="true"/>
</Attribute>

<!-- Fourth, the SAML 2.0 NameID Format: -->
<Attribute name="urn:oasis:names:tc:SAML:2.0:nameid-format:persistent" id="persistent-id">
<AttributeDecoder xsi:type="NameIDAttributeDecoder" formatter="$NameQualifier!$SPNameQualifier!$Name" defaultQualifiers="true"/>
</Attribute>
<Attribute name="urn:oid:" id="eduPersonPrincipalName"/>
<Attribute name="urn:oid:0.9.2342.19200300.100.1.1" id="uid"/>

The last two entries, eduPersonPrincipalName and uid (marked in red in the original post), are important: they are the bits that we pipe in to Drupal.

For debugging we used https://www.bioconnector.virginia.edu/Shibboleth.sso/Session to see what was coming across. Once it was all good, we got a response that looks like this:

Session Expiration (barring inactivity): 479 minute(s)
Client Address:
SSO Protocol: urn:oasis:names:tc:SAML:2.0:protocol
Identity Provider: urn:mace:incommon:virginia.edu
Authentication Time: 2015-11-16T15:35:39.118Z
Authentication Context Class: urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport
Authentication Context Decl: (none)

affiliation: staff@virginia.edu;employee@virginia.edu;member@virginia.edu
eduPersonPrincipalName: adp6j@virginia.edu
uid: adp6j
unscoped-affiliation: member;staff;employee

The uid and eduPersonPrincipalName variables are the pieces we needed to get Drupal to set up a session for us.
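To make the shape of these values concrete, here is a minimal Python sketch that reads the attribute lines of the dump above. It is purely illustrative: parse_session_attributes and split_scoped are hypothetical helper names, the "name: value;value" layout is assumed from the dump shown here, and none of this is taken from shib_auth.

```python
# Attribute lines copied from the session dump above.
SESSION_DUMP = """\
affiliation: staff@virginia.edu;employee@virginia.edu;member@virginia.edu
eduPersonPrincipalName: adp6j@virginia.edu
uid: adp6j
unscoped-affiliation: member;staff;employee
"""


def parse_session_attributes(text):
    """Map each attribute name to its list of values (hypothetical helper)."""
    attrs = {}
    for line in text.splitlines():
        name, sep, raw = line.partition(":")
        if not sep or not raw.strip():
            continue  # skip lines without a value
        attrs[name.strip()] = [value.strip() for value in raw.split(";")]
    return attrs


def split_scoped(value):
    """Split a scoped value such as 'user@scope' into (user, scope)."""
    local, _, scope = value.partition("@")
    return local, scope


attrs = parse_session_attributes(SESSION_DUMP)
print(attrs["uid"])
print(split_scoped(attrs["eduPersonPrincipalName"][0]))
```

The scoped eduPersonPrincipalName carries the same local part as uid, which is why either one can serve as the basis for the Drupal account name.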

Lastly the Drupal bit

The Drupal side of this is pretty straightforward.

We installed Drupal as usual and grabbed the shib_auth module.

[Screenshots of the shib_auth settings, including the Advanced tab, are not reproduced here.]

Categories: Elsewhere

Julien Danjou: Profiling Python using cProfile: a concrete case

Planet Debian - Mon, 16/11/2015 - 16:00

Writing programs is fun, but making them fast can be a pain. Python programs are no exception to that, but the basic profiling toolchain is actually not that complicated to use. Here, I would like to show you how you can quickly profile and analyze your Python code to find what part of the code you should optimize.

What's profiling?

Profiling a Python program means doing a dynamic analysis that measures the execution time of the program and of everything that composes it. That means measuring the time spent in each of its functions. This will give you data about where your program is spending time, and what area might be worth optimizing.

It's a very interesting exercise. Many people focus on local optimizations, such as determining which of the Python functions range or xrange is going to be faster. It turns out that knowing which one is faster may never be an issue in your program, and that the time gained by one of the functions above might not be worth the time you spend researching that, or arguing about it with your colleague.

Trying to blindly optimize a program without measuring where it is actually spending its time is a useless exercise. Following your gut alone is not always sufficient.

There are many types of profiling, as there are many things you can measure. In this exercise, we'll focus on CPU utilization profiling, meaning the time spent by each function executing instructions. Obviously, we could do many more kinds of profiling and optimization, such as memory profiling, which would measure the memory used by each piece of code, something I talk about in The Hacker's Guide to Python.


Since Python 2.5, Python provides a C module called cProfile which has a reasonable overhead and offers a good enough feature set. The basic usage boils down to:

>>> import cProfile
>>> cProfile.run('2 + 2')
         2 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Though you can also run a script with it, which turns out to be handy:

$ python -m cProfile -s cumtime lwn2pocket.py
         72270 function calls (70640 primitive calls) in 4.481 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.004    0.004    4.481    4.481 lwn2pocket.py:2(<module>)
        1    0.001    0.001    4.296    4.296 lwn2pocket.py:51(main)
        3    0.000    0.000    4.286    1.429 api.py:17(request)
        3    0.000    0.000    4.268    1.423 sessions.py:386(request)
      4/3    0.000    0.000    3.816    1.272 sessions.py:539(send)
        4    0.000    0.000    2.965    0.741 adapters.py:323(send)
        4    0.000    0.000    2.962    0.740 connectionpool.py:421(urlopen)
        4    0.000    0.000    2.961    0.740 connectionpool.py:317(_make_request)
        2    0.000    0.000    2.675    1.338 api.py:98(post)
       30    0.000    0.000    1.621    0.054 ssl.py:727(recv)
       30    0.000    0.000    1.621    0.054 ssl.py:610(read)
       30    1.621    0.054    1.621    0.054 {method 'read' of '_ssl._SSLSocket' objects}
        1    0.000    0.000    1.611    1.611 api.py:58(get)
        4    0.000    0.000    1.572    0.393 httplib.py:1095(getresponse)
        4    0.000    0.000    1.572    0.393 httplib.py:446(begin)
       60    0.000    0.000    1.571    0.026 socket.py:410(readline)
        4    0.000    0.000    1.571    0.393 httplib.py:407(_read_status)
        1    0.000    0.000    1.462    1.462 pocket.py:44(wrapped)
        1    0.000    0.000    1.462    1.462 pocket.py:152(make_request)
        1    0.000    0.000    1.462    1.462 pocket.py:139(_make_request)
        1    0.000    0.000    1.459    1.459 pocket.py:134(_post_request)

This prints out all the functions called, with the time spent in each and the number of times they have been called.
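A similar report can be produced from inside a program with the standard pstats module. The following is a minimal sketch, assuming Python 3; slow_sum and profile_call are made-up names used only for illustration.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive work so the profiler has something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Same ordering as `-s cumtime` on the command line; keep the top 5 rows.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum, 100000)
print(report)
```

Capturing the report as a string makes it easy to log or compare between runs, at the cost of a little more code than the one-line command.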

Advanced visualization with KCacheGrind

While useful, the output format is very basic and does not make it easy to grasp what is going on in complete programs. For more advanced visualization, I leverage KCacheGrind. If you have done any C programming and profiling in recent years, you may have used it, as it is primarily designed as a front-end for Valgrind-generated call graphs.

In order to use it, you need to generate a cProfile result file, then convert it to the KCacheGrind format. To do that, I use pyprof2calltree.

$ python -m cProfile -o myscript.cprof myscript.py
$ pyprof2calltree -k -i myscript.cprof

And the KCacheGrind window magically appears!
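The result file does not have to come from the command line. As a rough sketch, assuming Python 3, a cProfile.Profile object can be wrapped around just the block of interest and dumped to a file; profiled and fake_test_fetch are hypothetical names, and the workload is a stand-in for real code.

```python
import cProfile
import contextlib
import os
import pstats
import tempfile


@contextlib.contextmanager
def profiled(path):
    """Profile the enclosed block and write the raw stats to `path`."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield profiler
    finally:
        profiler.disable()
        profiler.dump_stats(path)


def fake_test_fetch():
    # Stand-in for the body of a real unit test.
    return sorted(str(i) for i in range(50000))


out_path = os.path.join(tempfile.mkdtemp(), "test_fetch.cprof")
with profiled(out_path):
    fake_test_fetch()

# The file can be reloaded with pstats, or passed to pyprof2calltree
# with the -i option as in the command above.
loaded = pstats.Stats(out_path)
print(loaded.total_calls)
```

Profiling only the enclosed block keeps interpreter start-up and unrelated setup out of the call graph.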

Concrete case: Carbonara optimization

I was curious about the performance of Carbonara, the small time series library I wrote for Gnocchi. I decided to do some basic profiling to see if there was any obvious optimization to do.

In order to profile a program, you need to run it. But running the whole program in profiling mode can generate a lot of data that you don't care about, and adds noise to what you're trying to understand. Since Gnocchi has thousands of unit tests and a few for Carbonara itself, I decided to profile the code used by these unit tests, as it's a good reflection of basic features of the library.

Note that this is a good strategy for a curious and naive first-pass profiling. There's no way to be sure that the hotspots you see in the unit tests are the actual hotspots you will encounter in production. Therefore, profiling under conditions and with a scenario that mimic what's seen in production is often a necessity if you need to push your program optimization further and want to achieve a perceivable and valuable gain.

I activated cProfile using the method described above, creating a cProfile.Profile object around my tests (I actually started to implement that in testtools). I then ran KCacheGrind as described above. Using KCacheGrind, I generated the following figures.

The test I profiled here is called test_fetch and is pretty easy to understand: it puts data in a timeserie object, and then fetches the aggregated result. The above list shows that 88 % of the ticks are spent in set_values (44 ticks out of 50). This function is used to insert values into the timeserie, not to fetch the values. That means that it's really slow to insert data, and pretty fast to actually retrieve them.

Reading the rest of the list indicates that several functions share the rest of the ticks: update, _first_block_timestamp, _truncate, _resample, etc. Some of the functions in the list are not part of Carbonara, so there's no point in looking to optimize them. The only thing that can be optimized is, sometimes, the number of times they're called.

The call graph gives me a bit more insight about what's going on here. Using my knowledge about how Carbonara works, I don't think that the whole stack on the left for _first_block_timestamp makes much sense. This function is supposed to find the first timestamp for an aggregate, e.g. with a timestamp of 13:34:45 and a period of 5 minutes, the function should return 13:30:00. The way it works currently is by calling the resample function from Pandas on a timeserie with only one element, but that seems to be very slow. Indeed, currently this function represents 25 % of the time spent by set_values (11 ticks out of 44).

Fortunately, I recently added a small function called _round_timestamp that does exactly what _first_block_timestamp needs without calling any Pandas function, so no resample. So I ended up rewriting that function this way:

 def _first_block_timestamp(self):
-    ts = self.ts[-1:].resample(self.block_size)
-    return (ts.index[-1] - (self.block_size * self.back_window))
+    rounded = self._round_timestamp(self.ts.index[-1], self.block_size)
+    return rounded - (self.block_size * self.back_window)
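For readers unfamiliar with this kind of rounding, here is a standalone sketch of flooring a timestamp to a period using only the standard library. It is not Carbonara's actual _round_timestamp, whose implementation is not shown here; round_timestamp is a hypothetical name, and the sketch assumes Python 3 and naive datetime values.

```python
import datetime


def round_timestamp(ts, period):
    """Round `ts` down to the nearest multiple of `period` since the epoch."""
    epoch = datetime.datetime(1970, 1, 1)
    elapsed = ts - epoch
    # timedelta % timedelta gives the time elapsed since the last boundary.
    return ts - (elapsed % period)


ts = datetime.datetime(2015, 11, 16, 13, 34, 45)
print(round_timestamp(ts, datetime.timedelta(minutes=5)))
# 2015-11-16 13:30:00
```

This matches the example given earlier: 13:34:45 with a period of 5 minutes rounds down to 13:30:00, using plain arithmetic instead of a resample call.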

And then I re-ran the exact same test to compare the output of cProfile.

The list of functions looks quite different this time. The share of time spent in set_values dropped from 88 % to 71 %.

The call stack for set_values shows that pretty well: we can't even see the _first_block_timestamp function as it is so fast that it totally disappeared from the display. It's now being considered insignificant by the profiler.

So we just sped up the whole insertion process of values into Carbonara by a nice 25 % in a few minutes. Not that bad for a first naive pass, right?

Categories: Elsewhere

Drupal Commerce: Contributor Spotlight: Joël Pittet

Planet Drupal - Mon, 16/11/2015 - 15:53
Say hi. (who are you and what do you do in the Commerce ecosystem)

Hi:) My name is Joël Pittet and I’m out of Vancouver, BC, Canada. I offered to help co-maintain commerce_discount and a few other Commerce modules, and I’m likely involved in messing about with patches all over the Commerce ecosystem.

How did you get involved with contributing to Drupal Commerce?

Started working on a Drupal Commerce project, noticed things could use some fixing up and jumped in the deep end. I was recognized for helping triage the commerce queue in a fervor to fix all the things.

Categories: Elsewhere

Drupal Easy: DrupalEasy Podcast 164 - Dentistry (Paul Johnson - Drupal Social Media)

Planet Drupal - Mon, 16/11/2015 - 15:18
Download Podcast 164

Paul Johnson (pdjohnson) joins Mike Anello and Ted Bowman to talk about Drupal's social media presence, how community members can get involved, and the forthcoming release of Drupal 8!

read more

Categories: Elsewhere

Jim Birch: No more View pages

Planet Drupal - Mon, 16/11/2015 - 11:00

Views has long been one of the magic pieces that makes Drupal my CMS of choice.  Views allows us to easily create queries of content in the UI, giving great power to the site builder. 

When you first create a view, the default, obvious choice is to create a "Page" display of the view.  A Page has a URL that people can visit to see the information, and gets us as site builders closer to job done.  However, I don't want you to do it!

When you first create a view, the options are that you can make a Page and a Block.  Selecting neither will allow you to create a "Master" display, and additional modules can hook in and add additional displays for your view.  In the screenshot below, you see we have additional displays of Attachment, Content pane, Context, and Feed in addition to the Block and Page displays.

All of our sites already have some sort of "Page" content type, for basic content of the site.  In this page content type, we add fields, set meta descriptions, add the pages to the XML sitemap, and include them in Drupal's core search.  When you create a view page, we only get the output at a URL; we miss the benefit of having a "Page" node at that URL.

Read more

Categories: Elsewhere

