Chris Hall on Drupal 8: Drupal Site Builder role

Planet Drupal - Mon, 16/11/2015 - 09:59
Drupal Site Builder role chrishu Mon, 11/16/2015 - 08:59
Categories: Elsewhere

Wouter Verhelst: terrorism

Planet Debian - Mon, 16/11/2015 - 09:07

noun | ter·ror·ism | \ˈter-ər-ˌi-zəm\ | no plural

The mistaken belief that it is possible to change the world through acts of cowardice.

They killed a lot of people, but their terrorism only intensified the people's resolve.

Categories: Elsewhere

DrupalCon News: 7 Things You Must Experience in India

Planet Drupal - Mon, 16/11/2015 - 01:47

India is shaped by countless influences, from centuries-old civilizations to modern-day technology. In its long journey, the country has absorbed many different cultures, which have given it different dimensions. You must experience some of these when you come for DrupalCon Asia 2016. Here is a list of seven unique experiences that will make your trip worth remembering.

Categories: Elsewhere

Norbert Preining: Movies: Monuments Men and Interstellar

Planet Debian - Mon, 16/11/2015 - 01:01

Over the rainy weekend we watched two movies: Monuments Men (in Japanese it is called Michelangelo Project!) and Interstellar. Both are blockbusters from the usual American companies, but they are light-years apart when it comes to quality. The Monuments Men is boring, without a story, without depth, historically inaccurate, a complete failure. Interstellar, although a long movie, keeps you frozen in your seat while being as scientific as possible, and sets your brain working hard.

My personal verdict: 3 rotten eggs (because Rotten Tomatoes are not stinky enough) for the Monuments Men, and 4 stars for Interstellar.


First, the plots of the two movies: The Monuments Men is loosely based on a true story about rescuing pieces of art at the end of the Second World War, before the Nazis could destroy them or the Russians could take them away. A group of art experts is sent into Europe and manages to find several hiding places of art taken by the Nazis.

Interstellar is set in the near future, where conditions on Earth are deteriorating to such a degree that human life seems likely to become impossible soon. Some years before the movie begins, a group of astronauts was sent through a wormhole into a different galaxy to search for new habitable planets. Now it is time to check out these planets and try to establish colonies there. Cooper, a retired NASA officer and pilot now working as a farmer, and his daughter are guided in some mysterious way to a secret NASA facility. Cooper is drafted as a pilot for the reconnaissance mission, and leaves Earth and our galaxy through the same wormhole. (Not telling more!)

Monuments Men

Looking at the cast of Monuments Men (George Clooney, Matt Damon, Bill Murray, John Goodman, Jean Dujardin, Bob Balaban, Hugh Bonneville, and Cate Blanchett), one would expect a great movie. But from the very first to the very last scene, it is a slowly meandering, shallow flow of stuck-together scenes without much coherence. Tension is generated only through unrelated events (stepping onto a landmine, patting a horse), but never developed properly. The dialogue is shallow and boring, with one exception: when Frank Stokes (George Clooney) meets the one German and questions him about the art, predicting that he will be hanged.

Historically, the movie is as inaccurate as it can be – despite Clooney stating that “80 percent of the story is still completely true and accurate, and almost all of the scenes happened”. That contrasts starkly with the verdict of Nigel Pollard (Swansea University): “There’s a kernel of history there, but The Monuments Men plays fast and loose with it in ways that are probably necessary to make the story work as a film, but the viewer ends up with a fairly confused notion of what the organisation was, and what it achieved.”

The movie leaves a bitter aftertaste: a celebration of American heroism paired with the usual stereotypes (French amour, German stupidity, Russian ignorance, etc.). Together with the half-baked dialogue, it feels like a permanent coitus interruptus.


Interstellar

Interstellar cannot boast a comparable cast, though it still has a few well-known names (Matthew McConaughey, Anne Hathaway, and Michael Caine!). But I believe this is actually a sign of quality. Balancing scientific accuracy well against the requirements of a blockbuster, the movie successfully bridges the gap between complicated science, in particular general relativity, and entertainment. While I would not go so far as to call the movie edutainment (like both the old and new Cosmos), it is surprising how much hard science is packed into it. This is mostly thanks to the theoretical physicist Kip Thorne acting as scientific consultant for the movie, but also due to the director Christopher Nolan being serious about it and studying relativity at Caltech.

Of course, scientific accuracy has limits: nobody knows what happens if one crosses the event horizon of a black hole, and even the existence of wormholes is purely theoretical so far. Still, throughout the movie it follows the two requirements laid out by Kip Thorne: “First, that nothing would violate established physical laws. Second, that all the wild speculations… would spring from science and not from the fertile mind of a screenwriter.”

I think the biggest compliment was that, despite the length, despite a long day out (see next blog), and despite the rather unfamiliar topic, my wife, who is normally not interested in space movies and the like, didn’t fall asleep during the movie, and I had to stop several times to explain details of the theory of gravity and astronomy. So in some sense it was perfect edutainment!

Categories: Elsewhere

Darren Mothersele: How to Survive Gentrification of the Drupal Community

Planet Drupal - Mon, 16/11/2015 - 01:00

We're finally approaching the release of Drupal 8.0.0 on 19th Nov. The biggest achievement of the Drupal community to date. A complete rewrite of the core system to use modern object-oriented PHP. An effort that is often referred to as "getting off the island".

While the switch from Drupal 7 to Drupal 8 is a big change for developers, it is the result of a slow process of maturation of the Drupal community. Drupal 8 brings changes that will be welcomed by many, will bring in many new users, and of course, will push a few people out. How can we survive this "gentrification" of the Drupal community and prosper without losing touch with why we loved Drupal in the first place?


Cities all over the world are becoming more exclusive, more expensive, and a natural result of this is gentrification. It's contentious. Some see this as urban improvement, some as social cleansing.

I moved to London nearly 12 years ago. Dalston, to be precise. I was back in Dalston this weekend for a party, and it's very different to how I remember it from 2004. I compared the nice clean Overground train to the unreliable and dirty Silverlink trains that used to run to Dalston. Then there was walking down Kingsland Road without being on guard; when I lived there in 2004 it was often cordoned off by police. The hipsters, the trendy coffee shops, and other obvious signs of gentrification proliferate.

Brixton was my home for many years, and I witnessed first-hand the results of gentrification. I had an office space in Brixton, and decided to leave it when the landlord announced he was increasing the rent by 25%. I lived in several flats around Brixton over the years, and eventually moved (a bit) further south as rental prices in Brixton soared. I say this with tongue in cheek, well aware that to many I'd be seen as one of the gentrifiers! It's the communities that settled here during the 1940s and 1950s that gave the area its eclectic multi-cultural feel. They're the ones who have been displaced, losing their homes and community as developers and "yuppies" take over.

Gentrification of the Drupal Community

I first used Drupal back in 2003, version 4 point something. It was fun. Hacky, but fun. I had to quickly get a site up for an event we were organising and Drupal offered a collaborative content model that set it apart from the other products we evaluated.

I came back to Drupal in 2007 for another community site build, and Drupal 5 had been released. It was really fun. Yes, still very hacky, but it came with the power to build a CMS exactly the way I wanted it to work, and it came with an awesome community of other hackers. A community of dedicated open-source types, who valued openness, and working on projects for good. I was hooked and made the leap to full time Drupal development. Through Drupal I got involved in the first social innovation camp, and other tech-for-good type things.

Szeged 2008 was my first Drupalcon. 500 Drupal contributors and users in a small university town in Hungary. Everyone I met truly cared about making Drupal an awesome project and was contributing time and effort in any way they could. Several years later, Drupalcon has grown: 2000+ attendees in Barcelona this year, 2300+ in Amsterdam last year. But as the community has grown, so has the commercial influence, with sales pitches as prevalent as learning sessions on the schedule.

One thing I noticed this year was that several sessions concluded with, or included, a call for donations or funding to accelerate a particular module or project's development. The precedent was set in the opening session of the conference when the Drupal Association made an announcement about the Drupal 8 accelerate funding programme. I'm not saying this is a bad thing. If this is what it takes to get Drupal finished in today's conditions, then that's great. But look at it as an indicator of how the community has changed, when compared to the sessions at Szeged seven years earlier. You would not have seen a call for a quarter of a million dollars in funding back then. Everyone was there because they loved it, not because they were being paid.

Hacking the hackers

While doing research for this post, I came across this brilliant essay, The hacker hacked, by Brett Scott about the gentrification of hacker culture. I quote his summary of the gentrification process:

Key to any gentrification process are successive waves of pioneers who gradually reduce the perceived risk of the form in question. In property gentrification, this starts with the artists and disenchanted dropouts from mainstream society who are drawn to marginalised areas. This, in turn, creates the seeds for certain markets to take root. A WiFi coffeeshop appears next to the Somalian community centre. And that, in turn, sends signals back into the mainstream that the area is slightly less alien than it used to be.

If you repeat this cycle enough times, the perceived dangers that keep the property developers and yuppies away gradually erode. Suddenly, the tipping point arrives. Through a myriad of individual actions under no one person’s control, the exotic other suddenly appears within a safe frame: interesting, exciting and cool, but not threatening. It becomes open to a carefree voyeurism, like a tiger being transformed into a zoo animal, and then a picture, and then a tiger-print dress to wear at cocktail parties. Something feels ‘gentrified’ when this shallow aesthetic of tiger takes over from the authentic lived experience of tiger.
-- Brett Scott

How does this relate to the Drupal community? Perhaps it starts with the NGOs and charities, our original flagship Drupal sites, that became our "artists and disenchanted dropouts from mainstream society". Then the big media companies move in as the "perceived dangers gradually erode". Eventually, The White House starts using Drupal, and we're at home with the large enterprise clients and big corporate contracts.

As the Drupal project developed, the requirements changed. Drupal's capabilities improved, and the Drupal user base and community advanced too.

This is evident in the development and standardisation of things like configuration management. It was never an issue in the early days, but as the community became more professional, solutions for configuration management were hacked together and then became standardised.

Configuration management is just one example of the many benefits the Drupal community has experienced through the process of gentrification. There's also great test coverage, performance improvements, greater tooling, and many other advancements that came to Drupal as the community matured. Drupal became less about hacking and more about software engineering.

Drupal 8

Development on Drupal 8 started in March 2011 and, four years later, it is set to be released on November 19, 2015. Over these years, Drupal has been rewritten, removing most of the pre-OO era PHP legacy.

Drupal's legacy was the "not invented here" mindset that became entrenched in the community through hacking solutions to extensibility into a language that was not designed to support it. And, a culture of not depending on third-party code due to early well publicised security issues with PHP extensions.

The move away from this legacy, the move to "get off the island", is a move towards more standardised, modern development practices, and a move to embrace the wider PHP community.

Social cleansing

I mentioned before that gentrification is contentious. Some see it as urban improvement, some as social cleansing. Drupal and the Drupal community have clearly benefitted already, and it looks like prosperous times are ahead for those who come along for the ride, and for the newcomers who join and adopt Drupal.

But what about the social cleansing? Will parts of the community be pushed out? Who gets left behind?

Drupal has suffered from an identity crisis. Because of its flexibility, it's been used for many things. Drupal's openness to hacking and extending, and its ability to do just about anything, meant it was more than just a CMS. Over the years many talked about "small core", and many used Drupal's core tools as a framework, building apps and tools well beyond what a typical CMS would be used for.

Drupal 8 is a content management system.

Drupal 8 focuses on content management, on providing tools for non-technical users to build and manage sites. That's what it always wanted to be anyway.

Drupal 8 leverages the wider PHP community, in particular the Symfony components, as its core. It no longer makes sense to see Drupal as a framework.

Among the parts of the community being displaced are those using Drupal as a framework. If this is you, then you may already be looking at a fork, like Backdrop, or playing with other frameworks, like the beautiful Laravel.

Another section of the community that may be displaced are those running Drupal on low-end and shared hosting. Through the gentrification process, Drupal's requirements have increased. The increased hosting requirements have meant that dedicated Drupal platform hosting providers have emerged. More options for scalability and custom software stacks have taken precedence over solutions for smaller websites.

Drupal also potentially loses the innovators. Drupal always had a reputation for being cutting edge and innovative. As it moves to become the enterprise choice of open-source CMS, innovation becomes less important, and stability, security, and backwards compatibility become more important. The biggest innovations in Drupal (flexible content types and Views) date back to the 4.7 era. Views is now in core in Drupal 8. As Drupal matures further from this point, we'll probably see Drupal adopting innovations from other systems and ecosystems, rather than innovating on its own. It's well placed to do this now: because it is built on Symfony components, innovations from the wider community will be easier to integrate.

Surviving Gentrification

Do you abandon the form, leave it to the yuppies and head to the next wild frontier? Or do you attempt to break the cycle, deface the estate-agent signs, and picket outside the wine bar with placards reading "Yuppies Go Home"?
-- Brett Scott

Or do you come along for the ride? Enjoy the benefits of gentrification, without losing the reason why you got involved in the first place?

If you're going to stick around then you're going to need to change a few things. Here are 5 steps that will get you started:

1. Learn the foundations that Drupal is now built on.

If (like me) you've got a background in OO then this shouldn't be too hard. I did several years of post-graduate research into semantics and verification of object-oriented software. You definitely don't need to go that deep, but I would highly recommend getting to grips with classic works on design patterns such as Gang of Four and Martin Fowler.

With a basic understanding of the core "patterns" of object-oriented software, you start to appreciate how Symfony works.

Drupal, Silex, Laravel, Symfony Full Stack, Symfony CMF, phpBB, Joomla, Magento, Piwik, PHPUnit, Sonata, and many more projects are built on this same foundation. So, it's definitely worth learning, and Drupal can be a good way to learn it, while still working with a system you know well.

Try building a simple app with Silex.

Check out Drupalcon (and Laracon) on YouTube. There's some great stuff. Like this talk from Ryan Weaver about Symfony and this talk by Ross Tuck about Models and Service Layers.

2. Do PHP the right way.

PHP has changed. There's a lot of outdated information and a lot of legacy code. Drupal 8 has been rewritten to remove this legacy code, but there's still a lot of bad advice on how to write PHP out there. Read PHP The Right Way for a full guide on how modern PHP should be crafted.

3. Use Composer, use and create PHP packages.

Getting off the island and embracing the wider PHP ecosystem means using Composer and its ecosystem of PHP packages. There are many more packages that are potentially compatible with Drupal, and by architecting your Drupal extensions as more general PHP packages you have access to a much wider pool of potential collaborators.

Creating PHP packages also forces you to write clean code, think like a software engineer, and write more maintainable, extensible, and reusable code. Check out The PHP League as examples of solid PHP packages. They have a good Skeleton starting package.

You may have made custom Drupal modules before. Try thinking about how you can refactor these into separate packages, and using the Drupal "module" as a small layer that integrates your logic with Drupal.

The SOLID principles will guide you towards creating good packages.

4. Use an IDE

This was a big one for me. I was always against using an IDE, burnt by early experiences with open-source IDEs. I settled on a customised Sublime Text setup, and various other apps. I didn't see much benefit in using one app for everything when I could combine a selection of my favorite apps to do the same thing.

I'm not sure why I stuck to this. I also do a lot of C++ programming. I have my own programming language (Cyril) for creating audio-reactive visuals. I use XCode for C++ as the debugging tools are essential when you're dealing with object graphs, memory management, and debugging pointer issues. So, why not use an IDE for my web development?

I tried PHPStorm and it's great. Far from the cumbersome experience I had in the early days with open-source IDEs, it offers a smooth, fast, integrated experience.

I think you can get away without an IDE when you're hacking on Drupal 7, but on an OO system like Drupal 8 you will need an IDE. You will need the integrated tooling, testing, and you'll be much more efficient with intelligent autocompletion, hinting, quick access to docs, and fast navigation of the huge codebase.

5. Identify your values and serve your purpose.

As the corporates, enterprises, and big businesses take over, it's important to remain true to yourself. By identifying your values you will be well placed to notice when they are being compromised.

You probably got into open-source because you believe in the power of collaboration. But, this value of collaboration can often be at odds with the cut-throat corporate culture of competition.

To be aware of this is to be aware of the opportunity to spread openness and collaboration with our work.

As the proceeds of Drupal's success flow into the community, it's important to use this to do good. To continue to serve our communities and society as a whole. To enable collaboration, share our work, and use openness to build the world we want.

Final thoughts

The real opportunity is to spread Drupal's values of cooperation to the wider population.

This is part of a bigger shift in society to adopt open-source values, principles, and methodologies. Chris Anderson says it best:

If the past ten years have been about discovering new social and innovation models on the Web, then the next ten years will be about applying them to the real world.
-- Chris Anderson

The Work Open Manifesto offers a useful formulation of what it means to be open that can apply beyond open source software: "Think Big, Start Small, Work Open".

Drupal is a great case study for starting small, thinking big, and working openly.

The Drupal community has always been transforming: improving ourselves, improving the product, improving our practices, and improving our tools.

Now it's time to think beyond Drupal, beyond the Drupal community, and to see Drupal's values of collaboration, teamwork, and openness spread through the wider community, society, and the world.

Categories: Elsewhere

Manuel A. Fernandez Montecelo: Work on aptitude

Planet Debian - Mon, 16/11/2015 - 00:44

Midsummer for me is also known as “Noite do Lume Novo” (literally “New Fire Night”), one of the big calendar events of the year, marking the end of the school year and the beginning of summer.

On this day, there are celebrations not unlike the bonfires on Guy Fawkes Night in England or Britain [1]. It is a bit different in that it is not a single event for the masses, more of a friends and neighbours thing, and that it lasts for a big chunk of the night (sometimes until morning). Perhaps for some people, or outside bigger towns or cities, Guy Fawkes Night is also celebrated in that way, and that's why during the first days of November there are fireworks rocketing and cracking in the neighbourhoods all around.

Like many other celebrations around the world involving bonfires, many of them also happening around the summer solstice, it is supposed to be a time of renewal of cycles, purification and keeping the evil spirits away; with rituals to that effect like jumping over the fire ─ when the flames are not high and it is safe enough.

So it was fitting that, in the middle of June (almost Midsummer in the northern hemisphere), I learnt that I was about to leave my now-previous job, which is a pretty big signal and precursor for renewal (and it might have something to do with purifying and keeping the evil away as well ;-) ).

Whatever... But what does all of this have to do with aptitude or Debian, anyway?

For one, it was a question of timing.

While looking for a new job (and I am still at it), I had more spare time than usual. DebConf 15 @ Heidelberg was within sight, and for the first time circumstances allowed me to attend this event.

It also coincided with the time when I re-gained access to commit to aptitude on the 19th of June. Which means Renewal.

End of June was also the time of the announcement of the colossal GCC-5/C++11 ABI transition in Debian, which was scheduled to start on the 1st of August, just before DebConf. Between 2 and 3 thousand source packages in Debian were affected by this transition, which a few months later is not yet finished (although the most important parts were completed by mid-to-late September).

aptitude itself is written in C++, and depends on several libraries written in C++, like Boost, Xapian and SigC++. All of them had to be compiled with the new C++11 ABI of GCC-5, in unison and in a particular order, for aptitude to continue to work (and for minimal breakage). aptitude and some dependencies did not even compile straight away, so this transition meant that aptitude needed attention just to keep working.
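
The ordering constraint described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not part of aptitude or of Debian's transition tooling: the function name rebuild_levels is hypothetical, and the dependency map lists only the libraries named above (the real graph is much larger). It groups packages into levels so that everything in one level depends only on packages in earlier levels, using the standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def rebuild_levels(depends_on):
    """Group packages into rebuild levels.

    depends_on maps each package to the set of packages it needs.
    Every package in a level depends only on packages in earlier levels.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    levels = []
    while sorter.is_active():
        ready = sorter.get_ready()    # packages whose dependencies are all done
        levels.append(sorted(ready))  # sorted only to make the output stable
        sorter.done(*ready)
    return levels


# Illustrative graph only: the three C++ libraries mentioned in the post.
DEPENDS_ON = {"aptitude": {"boost", "xapian", "sigc++"}}
LEVELS = rebuild_levels(DEPENDS_ON)
```

With this toy graph, the three libraries land in the first level and aptitude in the second, so rebuilding aptitude before all three are done would mix old and new ABI.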

With my having recently been awarded the Aptitude Hat again, attending DebConf for the first time, and sailing towards the Transition Maelstrom, it was a clear sign that Something Had to Be Done (to avoid the sideways looks and consequent shame at DebConf, if nothing else).

Happily (or a bit unhappily for me, but let's pretend...), with the unexpected free time on my hands, I changed the plans that I had before re-gaining the Aptitude Hat (some of them involving Debian, but in other ways; maybe I will post about that soon).

In July I worked to fix the problems before the transition started, so aptitude would be (mostly) ready, or in the worst case broken only for a few days, while the chain of dependencies was rebuilt. But apart from the changes needed for the new GCC-5, it was decided at the last minute that Boost 1.55 would not be rebuilt with the new ABI, and that the only version with the new ABI would be 1.58 (which caused further breakage in aptitude, was added to experimental only a few days before, and was moved to unstable after the transition had started). Later, in the first days of the transition, aptitude was affected for a few days by breakage in the dependencies, due to not being compiled in sequence according to the transition levels (so with a mix of old and new ABI).

With the critical intervention of Axel Beckert (abe / XTaran), things were not as bad as they could have been. He was busy testing and uploading in the critical days when I was enjoying a small holiday on my way to DebConf, with minimal internet access and communicating almost exclusively with him; and he promptly tended to the complaints arriving in the Bug Tracking System and asked for rebuilds of the dependencies with the new ABI. He also brought the packaging into shape, which had decayed a bit in the last few years.

Gruesome Challenges

But not all was solved yet: more storms were brewing and started to appear on the horizon, in the form of clouds of fire coming from nearby realms.

The APT Deities, which had long ago spilled out their secret, inner challenge (just the initial paragraphs), were relentless. Moreover, they were present at Heidelberg in full force, in ─or close to─ their home grounds, and they were Marching Decidedly towards Victory:

In the talk @ DebConf “This APT has Super Cow Powers” (video available), by David Kalnischkies, they told us about the niceties of apt 1.1 (still in experimental but hopefully coming to unstable soon), and they boasted about getting the lead in our arms race (should I say bugs race?) by a few open bug reports.

This act of provocation further escalated the tensions. The fierce competition which had been going on for some time reached new heights. So much so that the APT Deities and our team had to sit together in the outdoor areas of the venue and have many a weissbier together, while discussing and fixing bugs.

But beneath the calm on the surface, and while pretending to keep good diplomatic relations, I knew that Something Had to Be Done, again. So I could only do one thing ─ jump over the bonfire and Keep the Evil away, be that Keep Evil bugs Away or Keep Evil APT Deities Away from winning the challenge, or both.

After returning from DebConf I continued to dedicate time to the project, more than a full time job in some weeks, and this is what happened in the last few months, summarised in another graph, showing the evolution of the BTS for aptitude:

The numbers for apt right now (15th November 2015) are:

  • 629 open (731 if counting all merged bugs independently)
  • 0 Release Critical
  • 275 (318 unmerged) with severity Important or Normal
  • 354 (413 unmerged) with severity Minor or Wishlist
  • 0 marked as Forwarded or Pending

The numbers for aptitude right now are:

  • 488 open (573 if counting all merged bugs independently)
  • 1 Release Critical (but it is an artificial bug to keep it from migrating to testing)
  • 197 (239 unmerged) with severity Important or Normal
  • 271 (313 unmerged) with severity Minor or Wishlist
  • 19 (20 unmerged) marked as Forwarded or Pending

The Aftermath

As we can see, for the time being I could keep the Evil at bay, both in terms of bugs themselves and re-gaining the lead in the bugs race ─ the Evil APT Deities were thwarted again in their efforts.

... More seriously, as most of you suspected, the graph above is not the whole truth, so I don't want to boast too much. A big part of the reduction in the number of bugs is because of merging duplicates, closing obsolete bugs, applying translations coming from multiple contributors, or simple fixes like typos and useful suggestions needing minor changes. Many of the remaining problems are comparatively more difficult or time-consuming than the ones addressed so far (except perhaps avoiding the immediate breakage of the transition, which took weeks to solve), and there are many important problems still there, chief among them aptitude offering very poor solutions to resolve conflicts.

Still, even the simplest of the changes takes effort, and triaging hundreds of bugs is not fun at all and mostly a thankless effort, although there is the occasional kind soul who thanks you for handling a decade-old bug.

If being subjected to the rigours of the BTS and reading and solving hundreds of bug reports is not Purification, I don't know what is.

Apart from the triaging, there were 118 bugs closed (or pending) due to changes made in the upstream part or the packaging in the last few months, and there are many changes that are not reflected in bugs closed (like most of the changes needed due to the C++11 ABI transition, bugs and problems fixed that had no report, and general rejuvenation or improvement of some parts of the code).

How long this will last, I cannot know. I hope to find a job at some point, which obviously will reduce the time available to work on this.

But in the meantime, for all aptitude users: Enjoy the fixes and new features!


[1] ^ Some visitors of the recent mini-DebConf @ Cambridge perhaps thought that the fireworks and throngs gathered were in honour of our mighty Universal Operating System, but sadly they were not. They might be, some day. In any case, the reports say that the visitors enjoyed the fireworks.

Categories: Elsewhere

Carl Chenet: Retweet 0.5 : only retweet some tweets

Planet Debian - Mon, 16/11/2015 - 00:00

Retweet 0.5 is now available! The main new feature: Retweet now lets you define a list of hashtags; if any of them appears in the text of a tweet, that tweet is not retweeted.

You only need this line in your retweet.ini configuration file:


Have a look at the official documentation for the full details of how it works.
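
As a rough illustration of the rule described above, here is a minimal sketch in plain Python. It is not Retweet's actual code and does not show its configuration syntax: the function name should_retweet is hypothetical, and matching hashtags case-insensitively is an assumption made for this example.

```python
import re


def should_retweet(text, blocked_hashtags):
    """Return False when the tweet text contains any blocked hashtag.

    Assumption for this sketch: hashtags are compared case-insensitively,
    and entries in blocked_hashtags may be given with or without the '#'.
    """
    found = {tag.lower() for tag in re.findall(r"#(\w+)", text)}
    blocked = {tag.lstrip("#").lower() for tag in blocked_hashtags}
    return found.isdisjoint(blocked)
```

For example, should_retweet("Buy now #Ad", ["ad"]) returns False, while a tweet carrying none of the listed hashtags returns True.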

Retweet 0.5 is available on the PyPI repository and is already in the official Debian unstable repository.

Retweet is already in production for Le Journal Du hacker, a French FOSS community website for sharing and relaying news, and for LinuxJobs.fr, a job board for the French-speaking FOSS community.

What about you? Does Retweet help you grow your Twitter account? Leave your comments on this article.

Categories: Elsewhere

Dirk Eddelbuettel: Rcpp 0.12.2: More refinements

Planet Debian - Sun, 15/11/2015 - 21:45

The second update in the 0.12.* series of Rcpp is now on the CRAN network for GNU R. As usual, I will also push a Debian package. This follows the 0.12.0 release from late July which started to add some serious new features, and builds upon the 0.12.1 release in September. It also marks the sixth release this year where we managed to keep a steady bi-montly release frequency.

Rcpp has become the most popular way of enhancing GNU R with C or C++ code. As of today, 512 packages on CRAN depend on Rcpp for making analytical code go faster and further. That is up by more than fifty package from the last release in September (and we recently blogged about crossing 500 dependents).

This release once again features pull requests from two new contributors, with Nathan Russell and Tianqi Chen joining in. As shown below, other recent contributors (such as Dan) are keeping at it too. Keep'em coming! Luke Tierney also emailed us about a code smell he spotted, which we took care of. A big Thank You! to everybody helping with code, bug reports or documentation. See below for a detailed list of changes extracted from the NEWS file.

Changes in Rcpp version 0.12.2 (2015-11-14)
  • Changes in Rcpp API:

    • Correct return type in product of matrix dimensions (PR #374 by Florian)

    • Before creating a single String object from a SEXP, ensure that it is from a vector of length one (PR #376 by Dirk, fixing #375).

    • No longer use STRING_ELT as a left-hand side, thanks to a heads-up by Luke Tierney (PR #378 by Dirk, fixing #377).

    • Rcpp Module objects are now checked more carefully (PR #381 by Tianqi, fixing #380)

    • An overflow in Matrix column indexing was corrected (PR #390 by Qiang, fixing a bug reported by Allessandro on the list)

    • Nullable types can now be assigned R_NilValue in function signatures. (PR #395 by Dan, fixing issue #394)

    • operator<<() now always shows decimal points (PR #396 by Dan)

    • Matrix classes now have a transpose() function (PR #397 by Dirk fixing #383)

    • operator<<() for complex types was added (PRs #398 by Qiang and #399 by Dirk, fixing #187)

  • Changes in Rcpp Attributes:

    • Enable export of C++ interface for functions that return void.

  • Changes in Rcpp Sugar:

    • Added new Sugar functions cummin(), cummax(), cumprod() (PR #389 by Nathan Russell fixing #388)

    • Enabled sugar math operations for subsets; e.g. x[y] + x[z]. (PR #393 by Kevin and Qiang, implementing #392)

  • Changes in Rcpp Documentation:

    • The NEWS file now links to GitHub issue tickets and pull requests.

    • The Rcpp.bib file with bibliographic references was updated.

Thanks to CRANberries, you can also look at a diff to the previous release. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc. should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.


Lunar: Reproducible builds: week 29 in Stretch cycle

Planet Debian - Sun, 15/11/2015 - 18:51

What happened in the reproducible builds effort this week:

Toolchain fixes

Emmanuel Bourg uploaded eigenbase-resgen/ which uses the scm-safe comment style by default to make them deterministic.

Mattia Rizzolo started a new thread on debian-devel to ask a wider audience about issues with the -Wdate-time compile-time flag. When enabled, GCC and clang print warnings when __DATE__, __TIME__, or __TIMESTAMP__ are used. Having the flag set by default would prompt maintainers to remove these sources of unreproducibility from the sources.
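To get a feel for what the flag would report, here is a minimal sketch in Python that scans C source text for the three macros. It is only a rough stand-in for the compiler warning, not how GCC or clang implement it; the function name find_timestamp_macros and the sample source are made up for illustration.

```python
import re

# Macros that embed the build date or time into the compiled output.
TIMESTAMP_MACROS = re.compile(r"\b__(?:DATE|TIME|TIMESTAMP)__\b")


def find_timestamp_macros(source):
    """Return (line_number, macro) pairs for each macro use in the source text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in TIMESTAMP_MACROS.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Hypothetical C snippet, used only to exercise the scanner.
sample = 'const char *built = __DATE__ " " __TIME__;\nint answer = 42;\n'
print(find_timestamp_macros(sample))
```

Unlike the compiler, this plain text scan also matches macro names inside comments and string literals, so it can over-report.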

Packages fixed

The following packages have become reproducible due to changes in their build dependencies: bmake, cyrus-imapd-2.4, drobo-utils, eigenbase-farrago, fhist, fstrcmp, git-dpm, intercal, libexplain, libtemplates-parser, mcl, openimageio, pcal, powstatd, ruby-aggregate, ruby-archive-tar-minitar, ruby-bert, ruby-dbd-odbc, ruby-dbd-pg, ruby-extendmatrix, ruby-rack-mobile-detect, ruby-remcached, ruby-stomp, ruby-test-declarative, ruby-wirble, vtprint.

The following packages became reproducible after getting fixed:

Some uploads fixed some reproducibility issues, but not all of them:

Patches submitted which have not made their way to the archive yet:

  • #804729 on pbuilder by Reiner Herrmann: tell dblatex to build in a deterministic path.

The fifth and sixth armhf build nodes have been set up, resulting in five more builder jobs for armhf. More than 10,000 packages have now been identified as reproducible with the “reproducible” toolchain on armhf. (Vagrant Cascadian, h01ger)

Helmut Grohne and Mattia Rizzolo now have root access on all 12 build nodes used by reproducible.debian.net and jenkins.debian.net. (h01ger)

reproducible-builds.org is now linked from all package pages and the reproducible.debian.net dashboard. (h01ger)

profitbricks-build5-amd64 and profitbricks-build6-amd64, responsible for running amd64 tests, now run 398.26 days in the future. This means that one of the two builds that are being compared will be run on a different minute, hour, day, month, and year. This is not yet the case for armhf. FreeBSD tests are also done with 398.26 days difference. (h01ger)
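The effect of the 398.26-day offset can be checked with a little date arithmetic. The sketch below is only an illustration: the starting timestamp is an arbitrary example and the helper names are made up, but the offset is the one quoted above.

```python
from datetime import datetime, timedelta

# Offset quoted in the report: 398 days plus 0.26 of a day (about 6h14m).
OFFSET = timedelta(days=398.26)


def shifted_build_time(now):
    """Return the clock value the second build would see."""
    return now + OFFSET


def differing_fields(first, second):
    """List which of year, month, day, hour and minute differ."""
    names = ("year", "month", "day", "hour", "minute")
    return [name for name in names if getattr(first, name) != getattr(second, name)]


# Arbitrary example timestamp for the first build.
first_build = datetime(2015, 11, 15, 18, 51)
second_build = shifted_build_time(first_build)
print(second_build, differing_fields(first_build, second_build))
```

The whole days push the date past one year and one month, while the fractional day shifts the hour and the minute.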

The design of the Arch Linux test page has been greatly improved. (Levente Polyak)

diffoscope development

Three releases of diffoscope happened this week, numbered 39 to 41. They include support for EPUB files (Reiner Herrmann) and Free Pascal unit files, usually having .ppu as extension (Paul Gevers).

The rest of the changes were mostly aimed at making it easier to run diffoscope on other systems. The tlsh, rpm, and debian modules are now all optional. The test suite will properly skip tests that need optional tools or modules when they are not available. As a result, diffoscope is now available on PyPI and, thanks to the work of Levente Polyak, in Arch Linux.

Getting these versions into Debian was a bit cumbersome. Version 39 was uploaded with an expired key (according to the keyring on ftp.debian.org, which will hopefully be updated soon); this is currently handled by keeping the files in the queue without REJECTing them. This prevented any other Debian Developer from uploading the same version. Version 40 was uploaded as a source-only upload… but failed to build from source, which had the undesirable side effect of removing the previous version from unstable. The package failed to build from source because it was built passing -I to debuild. This excluded the ELF object files and static archives used by the test suite from the archive, preventing the test suite from working correctly. Hopefully it will soon be possible to implement a sanity check to prevent such mistakes.

It has also been identified that ppudump outputs time in the system timezone without considering the TZ environment variable. Zachary Vance and Paul Gevers raised the issue on the appropriate channels.

strip-nondeterminism development

Chris Lamb released strip-nondeterminism version 0.014-1 which disables stripping Mono binaries as it is too aggressive and the source of the problem is being worked on by Mono upstream.

Package reviews

133 reviews have been removed, 115 added and 103 updated this week.

Chris West and Chris Lamb reported 57 new FTBFS bugs.


The video of h01ger and Chris Lamb's talk at MiniDebConf Cambridge is now available.

h01ger gave a talk at CCC Hamburg on November 13th, which was well received and sparked some interest among Gentoo folks. Slides and video should be available shortly.

Frederick Kautz has started to revive Dhiru Kholia's work on testing Fedora packages.

Your editor wishes to once again thank #debian-reproducible regulars for reviewing these reports week after week.


Simon McVittie: Discworld Noir in a Windows 98 virtual machine on Linux

Planet Debian - Sun, 15/11/2015 - 18:19

Discworld Noir was a superb adventure game, but is also notoriously unreliable, even in Windows on real hardware; using Wine is just not going to work. After many attempts at bringing it back into working order, I've settled on an approach that seems to work: now that qemu and libvirt have made virtualization and emulation easy, I can run it in a version of Windows that was current at the time of its release. Unfortunately, Windows 98 doesn't virtualize particularly well either, so this still became a relatively extensive yak-shaving exercise.

These instructions assume that /srv/virt is a suitable place to put disk images, but you can use anywhere you want.

The emulated PC

After some trial and error, it seems to work if I configure qemu to emulate the following:

  • Fully emulated hardware instead of virtualization (qemu-system-i386 -no-kvm)
  • Intel Pentium III
  • Intel i440fx-based motherboard with ACPI
  • Real-time clock in local time
  • No HPET
  • 256 MiB RAM
  • IDE primary master: IDE hard disk (I used 30 GiB, which is massively overkill for this game; qemu can use sparse files so it actually ends up less than 2 GiB on the host system)
  • IDE primary slave, secondary master, secondary slave: three CD-ROM drives
  • PS/2 keyboard and mouse
  • Realtek AC97 sound card
  • Cirrus video card with 16 MiB video RAM

A modern laptop CPU is an order of magnitude faster than what Discworld Noir needs, so full emulation isn't a problem, despite being inefficient.

There is deliberately no networking, because Discworld Noir doesn't need it, and a 17 year old operating system with no privilege separation is very much not safe to use on the modern Internet!

Software needed
  • Windows 98 installation CD-ROM as a .iso file (cp /dev/cdrom windows98.iso) - in theory you could also use a real optical drive, but my laptop doesn't usually have one of those. I used the OEM disc, version 4.10.1998 (that's the original Windows 98, not the Second Edition), which came with a long-dead PC, and didn't bother to apply any patches.
  • A Windows 98 license key. Again, I used an OEM key from a past PC.
  • A complete set of Discworld Noir (English) CD-ROMs as .iso files. I used the UK "Sold Out Software" budget release, on 3 CDs.
  • A multi-platform Realtek AC97 audio driver.

Windows 98 installation

It seems to be easiest to do this bit by running qemu-system-i386 manually:

qemu-img create -f qcow2 /srv/virt/discworldnoir.qcow2 30G
qemu-system-i386 -hda /srv/virt/discworldnoir.qcow2 \
    -drive media=cdrom,format=raw,file=/srv/virt/windows98.iso \
    -no-kvm -vga cirrus -m 256 -cpu pentium3 -localtime

Don't start the installation immediately. Instead, boot the installation CD to a DOS prompt with CD-ROM support. From here, run


and create a single partition filling the emulated hard disk. When finished, hard-reboot the virtual machine (press Ctrl+C on the qemu-system-i386 process and run it again).

The DOS FORMAT.COM utility is on the Windows CD-ROM but not in the root directory or the default %PATH%, so you'll have to run:

d:\win98\format c:

to create the FAT filesystem. You might have to reboot again at this point.

The reason for doing this the hard way is that the Windows 98 installer doesn't detect qemu as supporting ACPI. You want ACPI support, so that Windows will issue IDLE instructions from its idle loop, instead of occupying a CPU core with a busy-loop. To get that, boot to a DOS prompt again, and use:

setup /p j /iv

/p j forces ACPI support. (Thanks to "Richard S" on the VirtualBox forums for this tip.) /iv is unimportant, but it disables the annoying "billboards" during installation, which advertised once-exciting features like support for dial-up modems and JPEG wallpaper.

I used a "Typical" installation; there didn't seem to be much point in tweaking the installed package set when everything is so small by modern standards.

Windows 98 has built-in support for the Cirrus VGA card that we're emulating, so after a few reboots, it should be able to run in a semi-modern resolution and colour depth. Discworld Noir apparently prefers a 640 × 480 × 16-bit video mode, so right-click on the desktop background, choose Properties and set that up.

Audio drivers

This is the part that took me the longest to get working. Of the sound cards that qemu can emulate, Windows 98 only supports the SoundBlaster 16 out of the box. Unfortunately, the Soundblaster 16 emulation in qemu is incomplete, and in particular version 2.1 (as shipped in Debian 8) has a tendency to make Windows lock up during boot.

I've seen advice in various places to emulate an Ensoniq ES1370 (SoundBlaster AWE 64), but that didn't work for me: one of the drivers I tried caused Windows to lock up at a black screen during boot, and the other didn't detect the emulated hardware.

The next-oldest sound card that qemu can emulate is a Realtek AC97, which was often found integrated into motherboards in the late 1990s. This one seems to work, with the "A400" driver bundle linked above. For Windows 98 first edition, you need a driver bundle that includes the old "VXD" drivers, not just the "WDM" drivers supported by Second Edition and newer.

The easiest way to get that into qemu seems to be to turn it into a CD image:

genisoimage -o /srv/virt/discworldnoir-drivers.iso WDM_A400.exe
qemu-system-i386 -hda /srv/virt/discworldnoir.qcow2 \
    -drive media=cdrom,format=raw,file=/srv/virt/windows98.iso \
    -drive media=cdrom,format=raw,file=/srv/virt/discworldnoir-drivers.iso \
    -no-kvm -vga cirrus -m 256 -cpu pentium3 -localtime -soundhw ac97

Run the installer from E:, then reboot with the Windows 98 CD inserted, and Windows should install the driver.

Installing Discworld Noir

Boot up the virtual machine with CD 1 in the emulated drive:

qemu-system-i386 -hda /srv/virt/discworldnoir.qcow2 \
    -drive media=cdrom,format=raw,file=/srv/virt/DWN_ENG_1.iso \
    -no-kvm -vga cirrus -m 256 -cpu pentium3 -localtime -soundhw ac97

You might be thinking "... why not insert all three CDs into D:, E: and F:?" but the installer expects subsequent disks to appear in the same drive where CD 1 was initially, so that won't work. Instead, when prompted for a new CD, switch to the qemu monitor with Ctrl+Alt+2 (note that this is 2, not F2). At the (qemu) prompt, use info block to see a list of emulated drives, then issue a command like

change ide0-cd1 /srv/virt/DWN_ENG_2.iso

to swap the CD. Then switch back to Windows' console with Ctrl+Alt+1 and continue installation. I used a Full installation of Discworld Noir.
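The qemu-system-i386 invocations in this post differ only in which CD images are attached and whether the sound card is enabled. If you find yourself retyping them, a small helper can assemble the argument list. This is just a sketch: build_qemu_args is a made-up name, and it uses only the options already shown in this post.

```python
def build_qemu_args(disk, cdroms=(), sound=False):
    """Assemble the qemu-system-i386 argument list used in this post."""
    args = ["qemu-system-i386", "-hda", disk]
    # One -drive option per CD image, in drive-letter order (D:, E:, ...).
    for iso in cdroms:
        args += ["-drive", "media=cdrom,format=raw,file=" + iso]
    args += ["-no-kvm", "-vga", "cirrus", "-m", "256",
             "-cpu", "pentium3", "-localtime"]
    if sound:
        args += ["-soundhw", "ac97"]
    return args


# Example: the command line for installing the game from CD 1.
install_args = build_qemu_args(
    "/srv/virt/discworldnoir.qcow2",
    cdroms=["/srv/virt/DWN_ENG_1.iso"],
    sound=True,
)
print(" ".join(install_args))
```

The resulting list can be handed to subprocess.call() to start the virtual machine.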

Transferring the virtual machine to GNOME Boxes

Having finished the "control freak" phase of installation, I wanted a slightly more user-friendly way to run this game, so I transferred the virtual machine to be used by libvirtd, which is the backend for both GNOME Boxes and virt-manager:

virsh create discworldnoir.xml

Here is the configuration I used. It's a mixture of automatic configuration from virt-manager, and hand-edited configuration to make it match the qemu-system-i386 command-line.

Running the game

If all goes well, you should now see a discworldnoir virtual machine in GNOME Boxes, in which you can boot Windows 98 and play the game. Have fun!


Daniel Pocock: Migrating data from Windows phones

Planet Debian - Sat, 14/11/2015 - 22:18

Many of the people who have bought Windows phones seek relief sooner or later. Sometimes this comes about due to peer pressure or the feeling of isolation; in other cases it is frustration with the user interface or the realization that they can't run cool apps like Lumicall.

Frequently, the user has been given the phone as a complimentary upgrade when extending a contract without perceiving the time, effort and potential cost involved in getting their data out of the phone, especially if they never owned a smartphone before.

When a Windows phone user does decide to cut their losses, they are usually looking to a friend or colleague with technical expertise to help them out. Personally, I'm not sure that anybody I would regard as an IT expert has ever had a Windows phone though, meaning that many experts are probably also going to be scratching their heads when somebody asks them for help. Therefore, I've put together this brief guide to help deal with these phones more expediently when they are encountered.

Windows phones have really bad support for things like CalDAV and WebDAV, so don't get your hopes up about using such methods to back up the data to any arbitrary server. Searching online you can find some hacks that involve creating a Google or iCloud account in the phone and then modifying the advanced settings to send the data to an arbitrary server. These techniques vary a lot between specific versions of the Windows Phone OS, so the techniques I've described below are probably easier.

Identify the Windows Live / Hotmail account

The user may not remember or realize that a Microsoft account was created when they first obtained the phone. It may have been created for them by the phone, a friend or the salesperson in the phone shop.

Look in the settings (Accounts) to find the account ID / email address. If the user hasn't been using this account, they may not recognize it and probably won't know the password for it. It is essential to try and obtain (or reset) the password before going any further, so start with the password recovery process. Microsoft may insist on sending a password reset email to some other email address that the user has previously provided or linked to their phone.

Extracting data from the phone

In many cases, the easiest way to extract the data is to download it from Microsoft live.com rather than extracting it from the phone. Even if the user doesn't realize it, the data is probably all replicated in live.com and so there is no further loss of privacy by logging in there to extract it.

Set up an IMAP mail client

An IMAP client will be used to download the user's emails (from the live.com account they may never have used) and SMS.

Install Mozilla Thunderbird (IceDove on Debian), GNOME Evolution or a similar program on the user's PC.

Configure the IMAP mail client to connect to the live.com account. Some clients, like Thunderbird, will automatically set up all the server details when you enter the live.com account ID. For manual account setup, the details here may help.

Email backup

If the user was not using the live.com account ID for email correspondence, there may not be a lot of mail in it. There may be some billing receipts or other things that are worth keeping though.

Create a new folder (or set of folders) in the user's preferred email account and drag and drop the messages from the live.com Inbox to the new folder(s).

SMS backup

SMS backup can also be done through live.com. It is slightly more complicated than email backup, but similar.

  • In the live.com Outlook email index page, look for the settings button and click Manage Categories.
  • Enable the Contacts and Photos categories with a tick in each of them.
  • Go back to the main Inbox page and look for the categories section on the bottom left-hand side of the screen, under the folder list. Click the Contacts category.
  • The page may now appear blank. That is normal.
  • On the top right-hand corner of the page, click the Arrange menu and choose Conversation.
  • All the SMS messages should now appear on the screen.
  • Under the mail folders list on the left-hand side of the page, click to create a new folder with a name like SMS.
  • Select all the SMS messages and look for the option to move them to a folder. Send them to the SMS folder you created.
  • Now use the IMAP mail client to locate the SMS folder and copy everything from there to a new folder in the user's preferred mail server or local disk.

Contacts backup

On the top left-hand corner of the live.com email page, there is a chooser to select other live.com applications. Select People.

You should now see a list of all the user's contacts. Look for the option to export them to Outlook and other programs. This will export them as a CSV file.

You can now import the CSV file into another application. GNOME Evolution has an import wizard with an option for Outlook file format. To load the contacts into a WebDAV address book, such as DAViCal, configure the address book in Evolution and then select it as the destination when running the CSV import wizard.

WARNING: beware of using the Mozilla Thunderbird address book with contact data from mobile devices and other sources. It can't handle more than two email addresses per contact and this can lead to silent data loss if contacts are not fully saved.
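Before importing, it may be worth checking whether any contact in the exported CSV has more than two email addresses. The sketch below is a rough illustration under assumptions: the column headings (First Name, E-mail Address, E-mail 2 Address, E-mail 3 Address) follow the usual Outlook CSV layout and may not match the live.com export exactly, and the function name and sample data are made up.

```python
import csv
import io


def contacts_with_many_emails(csv_text, limit=2):
    """Return (first_name, email_count) for contacts exceeding the limit."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: every email column has "mail" somewhere in its heading.
        emails = [value for key, value in row.items()
                  if key and "mail" in key.lower() and value and "@" in value]
        if len(emails) > limit:
            flagged.append((row.get("First Name", ""), len(emails)))
    return flagged


# Hypothetical export with one contact that has three addresses.
sample_csv = (
    "First Name,E-mail Address,E-mail 2 Address,E-mail 3 Address\n"
    "Alice,alice@example.org,alice@example.com,alice@example.net\n"
    "Bob,bob@example.org,,\n"
)
print(contacts_with_many_emails(sample_csv))
```

Any contact it reports is one that would lose addresses in an address book limited to two emails per contact.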

Calendar backup

Now go to the live.com application chooser again and select the calendar application. Microsoft provides instructions to extract the calendar, summarised here:

  • Look for the Share button at the top somewhere and click it.
  • On the left-hand side of the page, click Get a link
  • On the right-hand side, choose Show event details to ensure you get a full calendar and then click Create underneath it.
  • Look for the link with a webcals prefix. If you are downloading with a tool like wget, change the scheme prefix to https. Fetch the file from this link and save it with an ics extension.
  • Inspect the ics calendar file to make sure it looks like real iCalendar data.
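The last two steps can be sketched in a few lines of Python. The helper names and the example link are hypothetical, and the check is only a quick sanity test of the outer BEGIN:VCALENDAR / END:VCALENDAR wrapper, not a full iCalendar validation.

```python
def webcals_to_https(url):
    """Rewrite a webcals:// link so that ordinary HTTPS tools can fetch it."""
    prefix = "webcals://"
    if not url.startswith(prefix):
        raise ValueError("expected a webcals:// link")
    return "https://" + url[len(prefix):]


def looks_like_icalendar(text):
    """Check that the data is wrapped in a VCALENDAR object."""
    stripped = text.strip()
    return (stripped.startswith("BEGIN:VCALENDAR")
            and stripped.endswith("END:VCALENDAR"))


# Hypothetical link and a tiny stand-in for the downloaded file contents.
link = "webcals://calendar.example.invalid/user/calendar.ics"
downloaded = "BEGIN:VCALENDAR\r\nVERSION:2.0\r\nEND:VCALENDAR\r\n"
print(webcals_to_https(link), looks_like_icalendar(downloaded))
```

If the check fails, the download is most likely an HTML login or error page rather than calendar data.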

You can now import the ics file into another application. GNOME Evolution has an import wizard with an option for iCalendar file format. To load the calendar entries into a CalDAV server, such as DAViCal, configure the calendar server in Evolution and then select it as the destination when running the import wizard.

Back up the user's photos, videos and other data files

Hopefully you will be able to do this step without going through live.com. Try enabling the MTP or PTP mode in the phone and attach it to the computer using the USB cable. Hopefully the computer will recognize it in at least one of those modes.

Use the computer's file manager or another tool to simply back up the entire directory structure.

Reset the phone to factory defaults

Once the user has their hands on a real phone, it is likely they will never want to look at that Windows phone again. It is time to erase the Windows phone; there is no going back.

Go to Settings, then About, and tap the factory reset option. It is important to do this before obliterating the live.com account, otherwise there are scenarios where you could be locked out of the phone and unable to erase it.

Erasing may take some time. The phone will reboot and then display an animation of some gears spinning around for a few minutes and then reboot again. Wait for it to completely erase.

Permanently close the Microsoft live.com account

Keeping track of multiple accounts and other services is tedious and frustrating for most people, especially with services that try to force the user to receive email in different places.

You can help eliminate user fatigue by helping them permanently close the live.com account so they never have to worry about it again.

Follow the instructions on the Microsoft site.

At some point it will suggest certain actions you should take before closing the account; most can be ignored. One thing you should do is remove the link between the live.com account ID and the phone. It is a good idea to do this, as otherwise you may have problems erasing the device if you haven't already done so. Before completely closing the account, also verify that the factory reset of the phone completed successfully.

Dispose of the Windows phone safely

If you can identify any faults with the phone, the user may be able to return it under the terms of the warranty. Some phone companies may allow the user to exchange it for something more desirable when it fails under warranty.

It may be tempting to sell the phone to a complete stranger on eBay or install a custom ROM on it. In practice, neither option may be worth the time and effort involved. You may be tempted to put it beyond use so nobody else will suffer with it, but please try to do so in a way that is respectful of the environment.

Putting the data into a new phone

Prepare the new phone with a suitable ROM such as Replicant or CyanogenMod.

Install the F-Droid app on the new phone.

From F-Droid, install the DAVdroid app. DAVdroid will allow you to quickly sync the new phone against any arbitrary CalDAV and WebDAV server to populate it with the user's calendar and contact / address book data.

Now is a good time to install other interesting apps like Lumicall, Conversations and K-9 Mail.


Paul Johnson: Tell me your Celebr8D8 plans

Planet Drupal - Sat, 14/11/2015 - 12:40

I am spearheading the Drupal 8 release celebrations on social media on Thursday 19th November. Perhaps, like me, you have been working behind the scenes on a personal project to mark this significant occasion. If you have a website, special party or publicity stunt planned, I'm keen to know about it. Come the big day, I will use Drupal's social media and @Celebr8D8 to tell the world about your event, site or stunt.

Use my Drupal.org contact form to let me know about your release day plans. Thanks!


agoradesign: Including image styles in your module or theme in Drupal 8

Planet Drupal - Sat, 14/11/2015 - 11:29
Today I have a small practical tip for those who want to include a custom image style in their module or theme in Drupal 8, including a best practice proposal.

Juliana Louback: PaperTrail - Powered by IBM Watson

Planet Debian - Sat, 14/11/2015 - 11:06

In the final semester of my MSc program at Columbia SEAS, I was lucky enough to be able to attend a seminar course taught by Alfio Gliozzo entitled Q&A with IBM Watson. A significant part of the course is dedicated to learning how to leverage the services and resources available on the Watson Developer Cloud. This post describes the course project my team developed, the PaperTrail application.

Project Proposal

Create an application to assist in the development of future academic papers. Based on a paper’s initial proposal, PaperTrail predicts publications to be used as references or acknowledgement of prior art and provides a trend analysis of major topics and methods.

The objective is to speed the discovery of relevant papers early in the research process, and allow for early assessment of the depth of prior research concerning the initial proposal.

Meet the Team

Wesley Bruning, Software Engineer, MSc. in Computer Science

Xavier Gonzalez, Industrial Engineer, MSc. in Data Science

Juliana Louback, Software Engineer, MSc. in Computer Science

Aaron Zakem, Patent Attorney, MSc. in Computer Science

Prior Art

A significant amount of attention has been given to this topic over the past few decades. The table below shows the work the team deemed most relevant due to recency, accuracy and similarity of functionality.

The variation in accuracy displayed is a result of experimentation with different dataset sizes and algorithm variations. More information and details can be found in the prior art report.

The main differentiator of PaperTrail is that it provides a form of access to the citation prediction and trend analysis algorithm. With the exception of the project by McNee et al., these algorithms aren’t currently available for general use. The application on researchindex.net is open to use, but its objective is to rank publications and authors for given topics.


Citation Prediction: PaperTrail builds on the work done by Wolski’s team in Fall 2014. This algorithm builds a reference graph used to define research communities, with an associated vector of topic scores generated by an LDA model. The papers in each research community are then ranked by importance within the community with a custom ranking algorithm. When a target document is given to the algorithm as input, the LDA model is used to generate a vector of topics that are present in the document. The communities with the most similar topic vectors are selected, and the publications within these communities with the highest rank and greatest similarity to the input document are recommended as references. A more detailed description can be found here.
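To make the community selection step concrete, here is a minimal sketch. It assumes cosine similarity as the measure of how alike two topic vectors are; the description above does not say which measure Wolski’s algorithm uses, and the community names and topic scores are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two topic-score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_communities(doc_topics, communities, top_n=2):
    """Return the names of the communities most similar to the document."""
    ranked = sorted(
        communities,
        key=lambda name: cosine_similarity(doc_topics, communities[name]),
        reverse=True,
    )
    return ranked[:top_n]


# Invented topic vectors over three topics.
communities = {
    "databases": [0.8, 0.1, 0.1],
    "machine learning": [0.1, 0.8, 0.1],
    "networking": [0.1, 0.1, 0.8],
}
document_topics = [0.2, 0.7, 0.1]
print(closest_communities(document_topics, communities))
```

The papers inside the selected communities would then be ranked and filtered as described above; that step is not covered by this sketch.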

Trend Analysis: Initially, the idea was to use the AlchemyData News API to obtain statistics pertaining to the number of publications on a given topic over time. However, with the exception of buzz-words (e.g. ‘big data’), many more specialized topics appeared very infrequently in news articles, if at all. This isn’t entirely surprising given the target audience of PaperTrail. As a workaround, we use the Alchemy Language API to extract keywords from the abstracts in the dataset, in addition to relevance scores. The PaperTrail database could then be queried for entry counts for a given year and keyword to provide an indication of publication trends in academia. Note that the Alchemy Language API extracts multiple-word ‘keywords’ as well as single words.
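The year-and-keyword count can be sketched as follows. This is an in-memory illustration only: the entries and keywords are made up, and the real application would query the PaperTrail database with keywords already extracted by the Alchemy Language API.

```python
from collections import Counter


def keyword_counts_by_year(entries):
    """Count how many entries mention each keyword in each year."""
    counts = Counter()
    for year, keywords in entries:
        # Normalise case and count each keyword once per entry.
        for keyword in {k.lower() for k in keywords}:
            counts[(year, keyword)] += 1
    return counts


# Made-up (year, extracted keywords) pairs standing in for database rows.
entries = [
    (2012, ["big data", "LDA"]),
    (2013, ["big data"]),
    (2013, ["Big Data", "citation graph"]),
]
trend = keyword_counts_by_year(entries)
print(trend[(2012, "big data")], trend[(2013, "big data")])
```

Plotting these counts per keyword over the years gives the publication trend for that keyword.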


To maintain consistency with Wolski’s project, we are using the DBLP data as made available on aminer.org. The DBLP-Citation-network V5 dataset contains 1,572,277 entries; we are limited to the use of entries that contain both abstracts and citations, bringing the dataset size down to 265,865 entries.


A high-level visualization of the project architecture is displayed below. Before launching PaperTrail, it’s necessary to train Wolski’s algorithm offline. Currently any documentation with regard to the performance of said algorithm is unavailable; the PaperTrail project will include an evaluation phase and report the findings made.

The PaperTrail app and database will be hosted on the Bluemix Platform.

Status Report

Phases completed:

  • Project design

  • Prior art research

  • Data cleansing

  • Development and deployment of an alpha version of the PaperTrail app

Phases under development:

  • Algorithm training and evaluation

  • Keyword extraction

  • MapReduce of publication frequency by year and topic

  • Data visualization component


Craig Small: Mixing pysnmp and stdin

Planet Debian - Sat, 14/11/2015 - 08:04

Depending on the application, sometimes you want to have some socket operations going (such as loading a website) while also reading stdin. There are plenty of examples for this in Python, which usually boil down to making stdin behave like a socket and mixing it into the list of sockets select() cares about.
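The basic idea, without pysnmp, looks roughly like this. It is a sketch for Unix-like systems only, since select() on pipes and stdin does not work on Windows. A pipe stands in for stdin and a socket pair for the network so that it runs unattended; readable_sources is a made-up helper name.

```python
import os
import select
import socket


def readable_sources(sources, timeout=0.5):
    """Return the labels of the sources that have data waiting.

    sources maps a label to a file descriptor or an object with fileno().
    """
    by_fd = {}
    for label, source in sources.items():
        fd = source if isinstance(source, int) else source.fileno()
        by_fd[fd] = label
    ready, _, _ = select.select(list(by_fd), [], [], timeout)
    return sorted(by_fd[fd] for fd in ready)


# Stand-ins: pass sys.stdin and a real socket in actual use.
stdin_read, stdin_write = os.pipe()
net_a, net_b = socket.socketpair()

os.write(stdin_write, b"hello\n")
print(readable_sources({"stdin": stdin_read, "network": net_a}))
net_b.send(b"reply")
print(readable_sources({"stdin": stdin_read, "network": net_a}))
```

The first call reports only the stdin stand-in; once the peer has sent data, the second call reports both, because nothing has been read from the pipe yet.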

A while ago I asked on a mailing list whether pysnmp could use a different socket map, so that I could add my own sockets (UDP, TCP and a zmq, to name a few), and Ilya, the author of pysnmp, explained how pysnmp can use a foreign socket map.

The sample code below is merely a mixture of Ilya’s example code and the way stdin gets mixed into the fold. I have also updated it to the high-level pysnmp API, which explains the slight differences in the calls.

  1. from time import time
  2. import sys
  3. import asyncore
  4. from pysnmp.hlapi import asyncore as snmpAC
  5. from pysnmp.carrier.asynsock.dispatch import AsynsockDispatcher
  8. class CmdlineClient(asyncore.file_dispatcher):
  9.     def handle_read(self):
  10.         buf = self.recv(1024)
  11.         print("you said {}".format(buf))
  14. def myCallback(snmpEngine, sendRequestHandle, errorIndication,
  15.                errorStatus, errorIndex, varBinds, cbCtx):
  16.     print("myCallback!!")
  17.     if errorIndication:
  18.         print(errorIndication)
  19.         return
  20.     if errorStatus:
  21.         print('%s at %s' % (errorStatus.prettyPrint(),
  22.               errorIndex and varBinds[int(errorIndex)-1] or '?')
  23.         )
  24.         return
  26.     for oid, val in varBinds:
  27.         if val is None:
  28.             print(oid.prettyPrint())
  29.         else:
  30.             print('%s = %s' % (oid.prettyPrint(), val.prettyPrint()))
  32. sharedSocketMap = {}
  33. transportDispatcher = AsynsockDispatcher()
  34. transportDispatcher.setSocketMap(sharedSocketMap)
  35. snmpEngine = snmpAC.SnmpEngine()
  36. snmpEngine.registerTransportDispatcher(transportDispatcher)
  37. sharedSocketMap[sys.stdin] = CmdlineClient(sys.stdin)
  39. snmpAC.getCmd(
  40.     snmpEngine,
  41.     snmpAC.CommunityData('public'),
  42.     snmpAC.UdpTransportTarget(('', 161)),
  43.     snmpAC.ContextData(),
  44.     snmpAC.ObjectType(
  45.         snmpAC.ObjectIdentity('SNMPv2-MIB', 'sysDescr', 0)),
  46.     cbFun=myCallback)
  48. while True:
  49.     asyncore.poll(timeout=0.5, map=sharedSocketMap)
  50.     if transportDispatcher.jobsArePending() or transportDispatcher.transportsAreWorking():
  51.         transportDispatcher.handleTimerTick(time())

Some interesting lines from the above code:

  • Lines 8-11 are the stdin class; its handle_read method is called whenever there is text available on stdin.
  • Line 34 is where pysnmp is told to use our socket map and not its built-in one.
  • Line 37 is where we register the stdin handler in the socket map, so that input on stdin is passed to CmdlineClient.
  • Lines 39-46 send an SNMP query using the high-level API.
  • Lines 48-51 are my simple socket poller.

With all this I can handle keyboard presses and network traffic, such as a simple SNMP poll.

Catégories: Elsewhere

ActiveLAMP: Visual Regression Testing with Shoov.io

Planet Drupal - sam, 14/11/2015 - 03:00

Shoov.io is a nifty website testing tool created by Gizra. We at ActiveLAMP were first introduced to Shoov.io at DrupalCon LA. In fact, Shoov.io is an open source visual regression toolkit built on, you guessed it, Drupal 7.

Catégories: Elsewhere

Drupal core announcements: Drupal core security release window on Wednesday, November 18

Planet Drupal - ven, 13/11/2015 - 22:24
Start: 2015-11-18 (All day) America/New_York
Organizers: David_Rothstein
Event type: Online meeting (e.g. IRC meeting)

The monthly security release window for Drupal 6 and Drupal 7 core will take place on Wednesday, November 18.

This does not mean that a Drupal core security release will necessarily take place on that date for either the Drupal 6 or Drupal 7 branches, only that you should prepare to look out for one (and be ready to update your Drupal sites in the event that the Drupal security team decides to make a release).

There will be no bug fix/feature release on this date; the next window for a Drupal core bug fix/feature release is Wednesday, December 2 (and before that, on November 19, Drupal 8.0.0 is scheduled to be released).

For more information on Drupal core release windows, see the documentation on release timing and security releases, and the discussion that led to this policy being implemented.

Catégories: Elsewhere

Francois Marier: How Tracking Protection works in Firefox

Planet Debian - ven, 13/11/2015 - 21:40

Firefox 42, which was released last week, introduced a new feature in its Private Browsing mode: tracking protection.

If you are interested in how the blocklist behind this feature is put together and then used in Firefox, this post is for you.

Safe Browsing lists

There are many possible ways to download URL lists to the browser and check against that list before loading anything. One of those is already implemented as part of our malware and phishing protection. It uses the Safe Browsing v2.2 protocol.

In a nutshell, the way that this works is that each URL on the block list is hashed (using SHA-256) and then that list of hashes is downloaded by Firefox and stored into a data structure on disk:

  • ~/.cache/mozilla/firefox/XXXX/safebrowsing/mozstd-track* on Linux
  • ~/Library/Caches/Firefox/Profiles/XXXX/safebrowsing/mozstd-track* on Mac
  • C:\Users\XXXX\AppData\Local\mozilla\firefox\profiles\XXXX\safebrowsing\mozstd-track* on Windows

This sbdbdump script can be used to extract the hashes contained in these files and will output something like this:

$ ~/sbdbdump/dump.py -v .
- Reading sbstore: mozstd-track-digest256
[mozstd-track-digest256] magic 1231AF3B Version 3 NumAddChunk: 1 NumSubChunk: 0 NumAddPrefix: 0 NumSubPrefix: 0 NumAddComplete: 1696 NumSubComplete: 0
[mozstd-track-digest256] AddChunks: 1445465225
[mozstd-track-digest256] SubChunks:
...
[mozstd-track-digest256] addComplete[chunk:1445465225] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5
...
[mozstd-track-digest256] MD5: 81a8becb0903de19351427b24921a772

The name of the blocklist being dumped here (mozstd-track-digest256) is set in the urlclassifier.trackingTable preference which you can find in about:config. The most important part of the output shown above is the addComplete line which contains a hash that we will see again in a later section.

List lookups

Once it's time to load a resource, Firefox hashes the URL, as well as a few variations of it, and then looks for it in the local lists.
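To make the lookup step more concrete, here is a rough Python sketch of hashing a URL and a few of its variations and checking them against a set of SHA-256 hashes. The variation rule used here (dropping leading host labels, and trying both the root path and the full path) is a simplification assumed for illustration; it is not Firefox's actual canonicalization code, and the function names are made up.

```python
import hashlib
from urllib.parse import urlsplit


def sha256_hex(text):
    """Hex-encoded SHA-256 digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def lookup_candidates(url):
    """Simplified host-suffix / path-prefix variations of a URL (assumed rule)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Full host first, then shorter suffixes, keeping at least two labels.
    hosts = [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]
    paths = ["/"]
    if parts.path not in ("", "/"):
        paths.append(parts.path)
    return [host + path for host in hosts for path in paths]


def is_listed(url, hashed_list):
    """True if any variation of the URL hashes to an entry in hashed_list."""
    return any(sha256_hex(c) in hashed_list for c in lookup_candidates(url))
```

With a list containing only the hash of twimg.com/, a load of https://pbs.twimg.com/media/x.png would match through the twimg.com/ variation.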

If there's no match, then the load proceeds. If there's a match, then we do an additional check against a pairwise allowlist.

The pairwise allowlist (hardcoded in the urlclassifier.trackingWhitelistTable pref) is designed to encode what we call "entity relationships". The list groups related domains together for the purpose of checking whether a load is first or third party (e.g. twitter.com and twimg.com both belong to the same entity).

Entries on this list (named mozstd-trackwhite-digest256) look like this:

twitter.com/?resource=twimg.com

which translates to "if you're on the twitter.com site, then don't block resources from twimg.com".

If there's a match on the second list, we don't block the load. It's only when we get a match on the first list and not the second one that we go ahead and cancel the network load.
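The two-step decision can be sketched in a few lines of Python. This is only an illustration: it assumes both lists are plain sets of SHA-256 hex digests, that blocklist entries have the form domain/ and that allowlist entries have the form page/?resource=domain (the canonicalized form that shows up in the list generation logs). The function names are hypothetical.

```python
import hashlib


def sha256_hex(text):
    """Hex-encoded SHA-256 digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def should_block(page_domain, resource_domain, blocklist, allowlist):
    """Block only on a blocklist match that has no pairwise allowlist match."""
    if sha256_hex(resource_domain + "/") not in blocklist:
        return False  # no match on the first list: the load proceeds
    pair = "{}/?resource={}".format(page_domain, resource_domain)
    # a match on the second list means both domains belong to the same entity
    return sha256_hex(pair) not in allowlist
```

For example, with twimg.com/ on the blocklist and twitter.com/?resource=twimg.com on the allowlist, a twimg.com load is cancelled on an unrelated site but allowed on twitter.com.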

If you visit our test page, you will see tracking protection in action with a shield icon in the URL bar. Opening the developer tools console will expose the URL of the resource that was blocked:

The resource at "https://trackertest.org/tracker.js" was blocked because tracking protection is enabled.

Creating the lists

The blocklist is created by Disconnect according to their definition of tracking.

The Disconnect list is on their GitHub page, but the copy we use in Firefox is the one in our own repository. Similarly, the Disconnect entity list is published by Disconnect, but we use the copy in our repository. Should you wish to be notified of any changes to the lists, you can simply subscribe to this Atom feed.

To convert this JSON-formatted list into the binary format needed by the Safe Browsing code, we run a custom list generation script whenever the list changes on GitHub.

If you run that script locally using the same configuration as our server stack, you can see the conversion from the original list to the binary hashes.

Here's a sample entry from the mozstd-track-digest256.log file:

[m] twimg.com >> twimg.com/
[canonicalized] twimg.com/
[hash] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5

and one from mozstd-trackwhite-digest256.log:

[entity] Twitter >> (canonicalized) twitter.com/?resource=twimg.com, hash a8e9e3456f46dbe49551c7da3860f64393d8f9d96f42b5ae86927722467577df

This, in combination with the sbdbdump script mentioned earlier, will allow you to audit the contents of the local lists.

Serving the lists

The way that the binary lists are served to Firefox is through a custom server component written by Mozilla: shavar.

Every hour, Firefox requests updates from shavar.services.mozilla.com. If new data is available, then the whole list is downloaded again. Otherwise, all it receives in return is an empty 204 response.

Should you want to play with it and run your own server, follow the installation instructions and then go into about:config to change these preferences to point to your own instance:

browser.trackingprotection.gethashURL
browser.trackingprotection.updateURL

Note that on Firefox 43 and later, these prefs have been renamed to:

browser.safebrowsing.provider.mozilla.gethashURL
browser.safebrowsing.provider.mozilla.updateURL

Learn more

If you want to learn more about how tracking protection works in Firefox, you can find all of the technical details on the Mozilla wiki or you can ask questions on our mailing list.

Thanks to Tanvi Vyas for reviewing a draft of this post.

Catégories: Elsewhere

Drupal core announcements: CHANGELOG.txt is being updated to prepare for D8 release. What changes do you want highlighted?

Planet Drupal - ven, 13/11/2015 - 20:52

CHANGELOG.txt is being updated to prepare for D8 release. What changes do you want highlighted?

See issue: https://www.drupal.org/node/2606334

Catégories: Elsewhere

John Goerzen: Memories of a printer

Planet Debian - ven, 13/11/2015 - 19:42

I have a friend who hates printers. I’ll call him “Mark”, because that, incidentally, is his name. His hatred for printers is partly my fault, but that is, ahem, a story for another time that involves him returning from a battle with a printer with a combination of weld dust, toner, and a deep scowl on his face.

I also tend to hate printers. Driver issues, crinkled paper, toner spilling all over the place… everybody hates printers.

But there is exactly one printer that I have never hated. It’s almost 20 years old, and has some stories to tell.

Nearly 20 years ago, I was about to move out of my parents’ house, and I needed a printer. I bought a LaserJet 6MP. This printer ought to have been made by Nokia. It’s still running fine, 18 years later. It turned out to be one of the best investments in computing equipment I’ve ever made. Its operating costs, by now, are cheaper than just about any printer you can buy today — less than one cent per page. It has been supported by every major operating system for years.

PostScript was important, because back then running Ghostscript to convert to PCL was both slow and a little error-prone. PostScript meant I didn’t need a finicky lpr/lprng driver on my Linux workstation to print. It just… printed. (Hat tip to anyone else that remembers the trial and error of constructing an /etc/printcap that would print both ASCII and PostScript files correctly!)

Out of this printer have come plane and train tickets, taking me across the country to visit family and across the world to visit friends. It’s printed resumes and recipes, music and university papers. I even printed wedding invitations and envelopes on it two years ago, painstakingly typeset in LaTeX and TeXmacs. I remember standing at the printer in the basement one evening, feeding envelope after envelope into the manual feed slot. (OK, so it did choke on a couple of envelopes, but overall it all worked great.)

The problem, though, is that it needs a parallel port. I haven’t had a PC with one of those in a long while. A few years ago, in a moment of foresight, I bought a little converter box that has an Ethernet port and a parallel port, with the idea that it would pay for itself by letting me not maintain some old PC just to print. Well, it did, but now the converter box is dying! And they don’t make them anymore. So I finally threw in the towel and bought a new LaserJet.

It cost a third of what the 6MP did, has a copier, scanner, prints in color, does duplexing, has wifi… and, yes, still supports PostScript, which was, strangely enough, a deciding factor in going with HP over Brother once again. (The other was image quality.)

We shall see if I am still using it when I’m 50.

Catégories: Elsewhere

