As mentioned by Raphaël Hertzog, I have been spending some time on improving Britney2. Just the other day I submitted a second branch for review that I expect to merge early next week. I also have another set of patches coming up soon. Currently, none of them are really user visible, so unless you are hosting your own version of Britney, these patches are probably not all that interesting to you.
- Reduce the need for backtracking by finding semantically equivalent packages.
- Avoid needing to set up a backtrack point in some cases.
  - This has the side-effect of eliminating some O(e^n) runtime cases.
- Optimise “installability” testing of packages affected by a hinted migration.
  - This has the side-effect of avoiding some O(e^n) runtime cases when the “post-hint” state does not trigger said behaviour.
  - There is a follow-up patch for this one coming in the third series to fix a possible bug in a corner case (causing a valid hint to be incorrectly rejected when it removed an “uninstallable” package).
- Reduce the number of affected packages to test when migrating items by using knowledge about semantically equivalent packages.
  - In some cases, Britney can now do “free” migrations when all binaries being updated replace semantically equivalent packages.
- (Merge pending) Avoid many redundant calls to “sort_actions()”, which exhibits at least O(n^2) runtime in some cases.
  - For the dataset Raphaël submitted, this patch shaves over 30 minutes off the runtime. In this particular case, each call to sort_actions takes 3+ minutes and it was called at least 10 times where it was not needed.
  - That said, sort_actions has a vastly lower runtime in the runs for Debian (and presumably also Ubuntu, since no one has complained from their side so far).
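To illustrate the idea of semantically equivalent packages from the highlights above, here is a minimal Python sketch. It is not Britney's code: the `group_equivalent` function, the sample package names, and the assumption that two packages count as equivalent when their dependency clauses and conflicts are identical are all made up for illustration.

```python
from collections import defaultdict


def group_equivalent(packages):
    """Group package names whose (depends, conflicts) signatures are identical.

    `packages` maps a name to (depends, conflicts), where `depends` is a list
    of OR-clauses (each a list of names) and `conflicts` is a list of names.
    This data layout is an assumption made for the sketch.
    """
    groups = defaultdict(list)
    for name, (depends, conflicts) in packages.items():
        # Normalise into hashable, order-independent values.
        signature = (
            frozenset(frozenset(clause) for clause in depends),
            frozenset(conflicts),
        )
        groups[signature].append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical sample data.
packages = {
    "libfoo1-a": ([["libc6"]], ["libfoo0"]),
    "libfoo1-b": ([["libc6"]], ["libfoo0"]),
    "bar": ([["libfoo1-a", "libfoo1-b"]], []),
}
equivalence_classes = group_equivalent(packages)
```

Under these assumptions, a solver would only need to test one representative per class, which is roughly why fewer backtrack points and fewer installability tests are needed.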
The results so far:
After the first patch series was merged, the Kali dataset (from Raphaël) could be processed in “only” ~2 hours. With the second patch series merged, the runtime for the dataset will drop by another 30-50 minutes (most of which is thanks to the change mentioned in highlight #5).
The third patch series currently does not have any noteworthy performance-related changes. It will probably be limited to bug fixes and some refactoring.
The first 3 highlights only affect the “new” installability tester, meaning that the Britney2 instances at Ubuntu and Tanglu should be mostly unaffected by the O(n^2) runtime. Although those cases will probably just fail with several “AIEEE“s. :) The 5th highlight should be equally interesting to all Britney2 instances though.
For me, the most interesting part is that we have never observed the O(n^2) behaviour in a daily “sid -> testing” run. The dataset from Raphaël was basically a “stable -> testing/sid” run, which is a case I do not think we have ever done before. Despite our current updates, there is still room for improvements on that particular use case.
In particular, I was a bit disheartened at how poorly our auto hinter(s) performed on this dataset. Combined they only assisted with the migration of something like 28 “items”. For comparison, the “main run” migrated ~7100 “items” and 9220 items were unable to migrate. Furthermore, the “Original” auto hinter spent the better part of 15 minutes computing hints, though at least it resulted in 10 “items” migrating.
Links to the patches:
- First series (already merged)
- Second series (pending, to be merged Monday, the 11th of August)
- Third series (under development. Warning: the branch will be rebased if/as needed)
After some downtime due to the identi.ca changes, the Debian Twitter accounts are now back.
- Debian Bug: Lists all bugs reported - High traffic
- Debian Upload: Reports all new uploads - High traffic
- Debian New: Reports packages in the NEW queue
- Debian Remove: Reports packages removed from the archive
New Twitter feed ideas are welcome.
I don't really like any of the ticketing systems I've ever needed to use, whether they've been used as bug tracking systems, user support issue management systems, or something else. Some are not too bad. I currently rely most on debbugs and ikiwiki.
debbugs is the Debian bug tracking system. See https://www.debian.org/Bugs/ for an entry point. It's mostly mail based, with a read-only web interface. You report a bug by sending an e-mail to a submission address, and (preferably) include a few magic "pseudo-headers" at the top of your message body to identify the package and version. There are tools to make this easier, but mostly it's just about sending an e-mail. All replies are via e-mail as well. Effectively, each bug becomes its own little dedicated mailing list.
This is important. A ticket, whether it is a bug report or a support request, is all about the discussion. "Hey I have this problem..." followed by "Have you tried..." and so forth. Anything that makes that discussion easier and faster to have is better.
It is my very strong opinion, and long experience, that the best way to have such a discussion is over e-mail. A lot of modern ticketing systems are web based. They might have an e-mail mode, perhaps read-only, but that's mostly an afterthought. It's a thing bolted onto the side of the system because people like me whinge otherwise.
I like e-mail for this for several reasons.
E-mail is push, not pull. I don't need to go look at a web page to be notified that something's happened.
E-mail requires no extra usernames and passwords to manage. I don't need to create a new account every time I encounter a new ticketing system instance.
E-mail makes it very easy to respond. I can just reply to a message. I don't need to go to a web site, log in, and find a reply button.
I already have archives of my e-mail, so referring to old messages (or finding them) is easy and quick. (Mutt, offlineimap, and notmuch are my particular set of choices. But I'm not locked to them, and you can use whatever you like, too.)
E-mail is a very rich format. Discussions are inherently threaded, and various character sets, languages, attachments, and other such things just work.
For these reasons, I strongly prefer ticketing systems in which e-mail is the primary form of discussion, and e-mail is a first class citizen. I don't mind if there are other ways to participate in the discussion, but if I have to use something other than e-mail, I tend not to be happy.
I use ikiwiki to provide a distributed, shared notebook on bugs. It's a bit cumbersome, and doesn't work well for discussions.
I think we can improve on the way debbugs works, however. I've been thinking about ticketing systems for Obnam (my backup program), since it is gaining enough users that it's getting hard to keep track of discussions with just an e-mail client.
Here's what I want:
Obnam users do not need to care about there being a ticketing system. They report a problem by e-mailing the support mailing list, and they keep the list in cc when conducting the discussion. This is very similar to debbugs, with the distinction that there are no ticket numbers that must be kept in the replies.
The support staff (that's me, but hopefully others as well) have access to the ticketing system, which automatically sorts incoming messages into tickets. Tickets have sufficient metadata that it's possible to track which ones have been dealt with, or still need work, and perhaps other things. Each ticket contains a Maildir with all the e-mails belonging to that ticket.
The ticketing system is distributed. I need to be able to work on tickets offline, and to synchronise instances between different computers. Just like git. It's not enough to have an offline mode (e.g., queuing e-mails on my laptop for sending to debbugs when I'm back online).
There is a reasonably powerful search engine that can quickly find the relevant tickets, and messages, based on various criteria.
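As a rough illustration of the "automatically sorts incoming messages into tickets" wish above, here is a minimal Python sketch that groups messages into tickets by e-mail threading headers, so no ticket number needs to appear in replies. It is not the code in the repository mentioned below: the `assign_tickets` function is hypothetical, it uses only the standard `email` module, and it assumes every message has a Message-ID and that a reply is processed after the message it replies to.

```python
from email import message_from_string


def assign_tickets(raw_messages):
    """Return {Message-ID: ticket number}, treating each thread as one ticket."""
    ticket_of = {}
    next_ticket = 1
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = msg["Message-ID"].strip()
        # Message-IDs of earlier messages this one refers to.
        parents = (msg.get("References", "") + " " + msg.get("In-Reply-To", "")).split()
        ticket = next((ticket_of[p] for p in parents if p in ticket_of), None)
        if ticket is None:
            # No known parent: this message opens a new ticket.
            ticket = next_ticket
            next_ticket += 1
        ticket_of[msg_id] = ticket
    return ticket_of


# Hypothetical sample messages.
messages = [
    "Message-ID: <a@example.org>\nSubject: Backup fails\n\nHey, I have this problem...",
    "Message-ID: <b@example.org>\nIn-Reply-To: <a@example.org>\n"
    "Subject: Re: Backup fails\n\nHave you tried...",
    "Message-ID: <c@example.org>\nSubject: Restore question\n\nHow do I...",
]
tickets = assign_tickets(messages)
```

A real implementation would also have to cope with missing headers and out-of-order delivery, and would store each message in the ticket's Maildir.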
I will eventually have this. I'm not saying I'm working on this, since I don't have enough free time to do that, but there's a git repository, and some code, and it imports e-mails automatically now.
Some day there may even be a web interface.
(This has been a teaser.)
Our top goal for the sprints is to make significant progress on the three remaining beta blocker issues. These issues aren't the best place to jump in if you're not already following them, but plach, alexpott, effulgentsia, fago, and others are going to do what they can to get these issues done.
Beta deadline issues
The next priority for the sprints is the beta deadline issues, which are non-critical issues that will have to be postponed to either Drupal 8.1.x or Drupal 9 if they are not done by the time the beta is ready. Many of these issues are related to the Entity Field API, so if you're interested in those systems, reach out to entity and field maintainers fago and swentel at Drupalaton to see if there's a beta deadline issue for you.
Twig autoescape followups and double-escaping bugs
One of the beta-blocking issues that's already been resolved is enabling Twig's autoescape functionality, so that strings that have not already been sanitized by Drupal can be escaped automatically in the theme layer. There are a lot of important followups to this change, which can be grouped into two categories:
Double-escaping issues (#2297711)
Since Drupal already does its own sanitization at many different points, there are a number of places where we are unintentionally escaping markup twice, resulting in double-escaping bugs like:
<em>My double-escaped string</em>
When code uses the appropriate sanitization functions or the theme and render systems so that the output can be themed, escaped, and altered properly, double-escaping is not an issue. So, we need to fix these regressions, ideally by removing the markup from the code entirely and converting to a Twig template, or failing that, by using the inline templating render element. In some cases these issues might be simple to fix; in others they will require some refactoring.
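The effect of escaping twice can be shown outside Drupal. The following is a generic Python illustration using the standard library's `html.escape`, not Drupal or Twig code; the variable names are made up for the example.

```python
import html

markup = "<em>My double-escaped string</em>"

# Escaping once turns the tags into entities, so a browser shows them as text.
escaped_once = html.escape(markup)

# Escaping the already-escaped string also escapes the "&" of each entity.
escaped_twice = html.escape(escaped_once)

# What a browser would display for the twice-escaped string.
displayed = html.unescape(escaped_twice)
```

Here `displayed` equals `escaped_once`, meaning the reader sees raw entities such as `&lt;em&gt;` on the page instead of the intended text.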
Improper uses of SafeMarkup::set() (#2297703)
In order to inform the theme layer about what markup Drupal has already sanitized, strings that have been processed by t(), String::checkPlain() or Xss::filter() are automatically marked safe, as are markup strings created from render arrays via drupal_render(). This list of sanitized strings is stored by the SafeMarkup class, which is intended for internal use only. However, the initial conversion patch added SafeMarkup::set() calls in many places as an interim fix. We now need to remove as many of these improper uses of SafeMarkup as possible, by converting or refactoring the code in the same way that we would to fix double-escaping bugs.
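The general idea of remembering which strings are already sanitized can be sketched in Python. This is only an analogy, not how SafeMarkup is implemented: as described above, Drupal keeps a list of sanitized strings, whereas this sketch swaps in a marker type, and the `SafeString` and `escape_once` names are hypothetical.

```python
import html


class SafeString(str):
    """Marker type for text that has already been sanitized."""


def escape_once(value):
    """Escape `value` unless it is already marked safe."""
    if isinstance(value, SafeString):
        return value
    return SafeString(html.escape(value))


user_input = "<script>alert('x')</script>"
first_pass = escape_once(user_input)
second_pass = escape_once(first_pass)  # unchanged: already marked safe
```

The sketch also hints at why improper marking is risky: wrapping unsanitized text in `SafeString` by hand would bypass escaping entirely.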
We will be sprinting on these issues at TCDrupal. Talk to YesCT or mdrummond for help getting started.
- Double-escaping issues (#2297711)
Critical issue triage
Once Drupal 8 is in beta, the next step will be to resolve the other critical issues that block a Drupal 8 release candidate. As a first step, we need to assess all of the critical issues to determine which are most important, which are no longer relevant, etc., as well as what the path to get each done is. In each critical, we should clearly identify:
- Why is it critical?
- What would be the implications of not fixing the issue?
- What would be the implications of fixing the issue between betas? (Code changed for modules, upgrade path, etc.)
- What would be the implications of fixing the issue after the first release candidate?
- What is the next step to make progress? What are the remaining tasks?
Talk to xjm to help with this essential task.
If you're sprinting at TCDrupal, remember to put the TCDrupal 2014 issue tag on issues you work on at the sprint. Similarly, use the Drupalaton 2014 tag at Drupalaton. And whether you're sprinting in Minnesota, in Hungary, or remotely, join the #drupal-contribute IRC channel to coordinate with other sprinters.
This past week at Drupal Costa Rica, I had a nice conversation with Todd Ross Nienkerk of Four Kitchens, who was there presenting on the notion of de-coupling content management and content display (here's video of a similar talk he did in Austin). I also spoke with Jesus Olivias, who recently did a great Spanish-language podcast with Omar Aguierre on the topic, and he was kind enough to give me his two cents.
Headless Drupal is officially now "a thing". It's all happening. If you're curious why this is exciting people, see my previous blog post on the topic: what's the big deal with headless websites? In this blog post I will dig into the technologies at your disposal for exploring Headless Drupal today.
Headless Drupal Now!
For those looking to develop Headless Drupal websites right now, you can totally do it with version 7. Even though there's excitement about the upcoming Drupal 8 release — and I'll detail the action below — you don't need to wait to get started with these techniques. Drupal 7 still has a long life ahead of it, and with the right contrib modules it is usable for anyone looking to build headless websites today.
The most well-known interface for Drupal 7 and an alternate front-end is the Services module, which has a very Drupal-ish manner (e.g. hook_services_alter()) of exposing various interfaces. It comes with built-in REST and XML-RPC interfaces, and allows you to expose nodes, users, taxonomies and other core data fairly easily behind custom endpoints (API paths). You can also use it as a basis for specifying your own custom services.
There's also the RestWS module, which exposes any Drupal entity on its existing URL based on headers. This module is the basis for Drupal 8's REST module, which we'll discuss more later.
Finally there's a really interesting package from the developers at Gizra, the Restful module, which is also entity-centric, but takes a different philosophical approach. Rather than exposing Drupal's internals, it allows developers to define what data they specifically want sent in response to a request. It also allows the exposure of some entity types and not others (e.g. the "Article" nodes, but not "Pages"). This module is definitely more developer-centric, but they have some nice blog posts about how they use it with AngularJS that will help you get up to speed.
The Future of Headless Drupal in Version 8
The future of Headless Drupal opens up significantly with version 8. Core includes both a REST interface module and a brand new routing system built on the Symfony2 HTTP kernel. This provides a lot of opportunity for headless implementations both for beginning and more advanced developers.
The REST module is a souped-up version of what you got from RestWS in Drupal 7. Your core entities are all eligible for exposure, using the JSON+HAL format by default. This gives consumers of entity data the ability to follow "links" to other data sources — for instance you can pull the definition of a content type from any node.
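To make the idea of following "links" concrete, here is a small Python sketch that reads the `_links` section of a HAL-style JSON document. The sample document and its URLs are invented for illustration and are not actual Drupal output; the `link_targets` helper is hypothetical and relies only on the HAL convention that `_links` maps relation names to objects (or arrays of objects) carrying an `href`.

```python
import json


def link_targets(hal_document):
    """Return {relation: [href, ...]} for every entry under "_links"."""
    targets = {}
    for relation, value in hal_document.get("_links", {}).items():
        # A relation may hold a single link object or a list of them.
        links = value if isinstance(value, list) else [value]
        targets[relation] = [link["href"] for link in links]
    return targets


# Invented sample response for a node; all URLs are placeholders.
sample = json.loads("""
{
  "_links": {
    "self": {"href": "http://example.com/node/1"},
    "type": {"href": "http://example.com/rest/type/node/article"}
  },
  "title": [{"value": "Hello headless world"}]
}
""")
targets = link_targets(sample)
```

A client could then request `targets["type"][0]` to fetch the content type definition, which is the kind of link-following described above.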
Making Drupal's native entity data model accessible to other apps via REST takes only a few clicks. Views — also in core for Drupal 8 — natively supports "REST export" as a type of display. You can configure your way to a robust REST API into your content without installing a single extra module.
For those looking for more specific or nuanced functionality, the core HTTP routing framework is one of the most exciting pieces. It's a general upgrade for how all Drupal modules handle requests, replacing the legendary hook_menu() with a fully-featured HTTP server. You can set up custom routes, define controllers for callbacks, and manage responses based on headers, status codes, and all the other things one cares about once you make the mental leap from "serving pages" to "talking HTTP" in your application.
For developers with experience building server-side applications in Python, Ruby on Rails, or Node, this is a welcome change. It opens the door to much more sophisticated implementations with Drupal: powering the backend for complex mobile applications, serving as a lightweight integration point for different kinds of data, even acting as a pure API to external application developers.
Much More To Come
There's still more to come. A big part of the equation is what's on the other side: now that we know how to build a headless backend in Drupal, what's the client? There are many exciting answers, which I'll address in another post, ideally with code samples for AngularJS, Backbone, and others.
There's also exciting movement in the headless direction in WordPress, where the WP-API project aims to have a native REST/JSON server bundled into the 4.1 or 4.2 releases later this/next year. I'll be doing a dive into the potential for those implementations soon as well.
Are you building headless applications? Do you have tips, tricks, or techniques to share? Let me know and let's spread the word!
Drupal has a nice internal tool to block IP addresses. It is available in core with no additional modules required. It can be accessed via Configuration -> People -> IP Address Blocking.
But it is practically useless without any automation to control spammers as it requires each IP to be manually submitted by the admin.
And there is the suite of modules available for Drupal, ranging from CAPTCHA to Mollom, all of which target preventing form submission. While they do a good job of preventing the spammer from submitting forms on your site, the spam bots are still able to access your site/form.
And most of the time, there are some really dumb spam bots that do not bother to check whether their spam attempt was successful. They do not realise that it failed, and they keep attempting to submit the same form repeatedly. While CAPTCHA, Mollom, Honeypot etc. on your site are discarding these form submissions from bots, your site’s resources are being utilised to generate this form and show it again to the bots thousands of times.
And the worst part is that many of these form pages are not really cached, to allow CAPTCHA etc. to function properly. This makes the condition even worse.
Have you ever wished there was a small module that just blocks a spammer completely after he either submits / attempts to submit a form a dozen times on your site?
So FBIP is here now!
It keeps track of form submissions, and if some user crosses a threshold that you specify, the user’s IP will be automatically blocked!
It is lightweight. It does not add any additional tables to your site. It makes use of the Flood Control API available in the core of Drupal to keep track of submissions per user.
You can choose between tracking either all forms on your site or specific form IDs.
You can whitelist some IPs that you do not want to be tracked (like your site administrators).
You can also choose to reset the IP bans at each cron run, if you do not wish to block any user permanently!
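The behaviour described above (count submissions per IP, block past a threshold, skip whitelisted IPs, optionally reset) can be sketched in a few lines of Python. This is an illustration of the logic only, not the module's code and not Drupal's Flood Control API; the `SubmissionTracker` class and its method names are hypothetical, and the IP addresses are placeholders.

```python
from collections import Counter


class SubmissionTracker:
    """Count form submissions per IP and block IPs that exceed a threshold."""

    def __init__(self, threshold, whitelist=()):
        self.threshold = threshold
        self.whitelist = set(whitelist)
        self.counts = Counter()
        self.blocked = set()

    def record(self, ip):
        """Record one submission; return True if the IP is now blocked."""
        if ip in self.whitelist:
            return False  # whitelisted IPs are never tracked
        self.counts[ip] += 1
        if self.counts[ip] > self.threshold:
            self.blocked.add(ip)
        return ip in self.blocked

    def reset(self):
        """Clear counts and bans, analogous to resetting at a cron run."""
        self.counts.clear()
        self.blocked.clear()


tracker = SubmissionTracker(threshold=3, whitelist=["10.0.0.1"])
results = [tracker.record("203.0.113.5") for _ in range(4)]
admin_results = [tracker.record("10.0.0.1") for _ in range(10)]
```

In this sketch the fourth submission from the same address crosses the threshold of 3, while the whitelisted address is never blocked no matter how often it submits.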
Beware Spammer, FBI(p) is watching you!
The organizers of D4D made a conscious effort to gear the first day more toward business, and I relished the opportunity to think more about client communication.
I attended a great session on getting better client feedback, and you can read my favorite client communication tips here.
Another gem of the day was a detailed explanation of copyright and creative commons, and a robust list of places to get open source fonts and stock imagery.
And check out window.matchMedia() -- it's a simple way to check if you have hit different breakpoints. (Be sure to grab the polyfill for IE9 and below.)
Last, but not least, was the discussion on streamlining development and testing. We got an overview of Google's Web Starter Kit and all of its goodies, like live reloading, synchronized browser testing, and a built-in, living style guide. And there was an audible gasp (from me) when they showed what browsersync.io could do; all devices on the network could look at the same local site, and when you scrolled down on one device THEY WOULD ALL SCROLL DOWN. Stunning.
The presentation was interesting, and the dev environment really parallels the dev environment we have home-brewed for ourselves here at Advomatic with a combination of Compass/Sass, Grunt, LiveReload, xip.io, and KSS. I quickly learned that there aren't many other shops doing this yet, so we couldn't talk about the nitty gritty details (like gnarly compile times). So that conversation is to be continued.
I can't wait to hear next year how others are using these tools to improve their workflow.
Home base was this magical Gehry building, the Stata Center, on the MIT campus.