Feed aggregator

Gunnar Wolf: University degrees and sysadmin skills

Planet Debian - Wed, 08/06/2016 - 19:03

I'll tune in to the post-based conversation being held on Planet Debian: Russell Coker wonders about what's needed to get university graduates with enough skills for a sysadmin job, to which Lucas Nussbaum responds with his viewpoints. They present very contrasting views of what students need, and for a good reason, I'd say: Lucas is an academician; I don't know for sure about Russell, but he seems to be a down-to-earth, dirty-handed, proficient sysadmin working in the field. They both deal with newcomers to their fields, and will notice different shortcomings.

I tend to side with Lucas' view. That does not come as a surprise, as I've been working for over 15 years at a university, and in the last few years I started moving from being a mostly operative sysadmin in an academic setting towards becoming an academician who spends most of his time sysadmining. A subtle but important distinction.

I teach at the BSc level at UNAM, and am a Masters student at IPN (respectively, Mexico's largest and second-largest universities). And yes, the lack of sysadmin abilities in both is surprising. But so is the lack of a good understanding of programming. And I'm sure that, were I to dig into several different fields, I'd feel the same: students' formation is very basic in each of those fields.

But I see that as natural. Of course, if I were to judge people as geneticists as they graduate from Biology, or were I to judge them as topologists as they graduate from Mathematics, or any other discipline in which I'm not an expert, I'd surely not know where to start — Given I have about 20 years of professional life on my shoulders, I'm quite skewed as to what is basic for a computing professional. And of course, there are severe holes in my formation, in areas I never used. I know next to nothing of electronics, my mathematical basis is quite flaky, and I'm a poor excuse when talking about artificial intelligence.

Where am I going with this? A university degree (BSc in English, which would amount to "licenciatura" in Spanish) is not for specialization. It is to have a sufficiently broad panorama of the field, and all of the needed tools to start digging deeper and specializing, either by yourself, working in a given field and learning its details as you go, or going through a postgraduate program (Specialization, Masters, Doctorate).

Even most of my colleagues at the Masters in Engineering in Security and Information Technology lack a good formation in fields I consider essential. However, what does information security mean? Many among them are working on the legal implications of several laws that touch our field. Many others are working on authenticity issues in images, audio and other such media. Many others are trying to come up with mathematical ways to cheapen the enormous burden of crypto operations (say, "shaving" CPU cycles off a very large exponentiation). Others are designing autonomous learning mechanisms to characterize malware. Were I as a computing professional to start talking about their research, I'd surely reveal I know nothing about it and get laughed at. That's because I haven't specialized in those fields.

University education should give a broad universal basis to enter a professional field. It should not focus on teaching tools or specific procedures (although some should surely be presented as examples or case studies). Although I'd surely be happy if my university's graduates were to know everything about administering a Debian system, that would be wrong for a university to aim at; I'd criticize it the same way I currently criticize programs that mix together university formation and industry certification as if they were related.

Categories: Elsewhere

ImageX Media: Lions, Tigers, and Bears, Oh My!

Planet Drupal - Wed, 08/06/2016 - 18:54

DrupalCon brings together thousands of people from the Drupal community who use, design for, develop for, and support the platform. It is the heartbeat of the Drupal community, where advancements in the platform are announced, learnings from its users are shared, and where connections that strengthen the community are made. 

Categories: Elsewhere

DrupalCon News: Frontend fatigue? Share your story

Planet Drupal - Wed, 08/06/2016 - 18:27

Are you fatigued as a Frontender? You are not alone. Frontend developers are moving in rapid waters all of the time. The explosion of frameworks and tools during the last 3 years was supposed to help us, but it is easy to end up feeling overwhelmed.

The polyglot frontend-er

Our mother tongues are HTML, CSS, and JavaScript, with SVG added to the mix. But to help us in our tasks, we've added Sass/Less, multiple templating languages, Markdown, JavaScript transpilers, testing languages, and whatnot.

Categories: Elsewhere

Lullabot: Adventures with eDrive: Accelerated SSD Encryption on Windows

Planet Drupal - Wed, 08/06/2016 - 18:01

As we enter the age of ISO 27001, data security becomes an increasingly important topic. Most of the time, we don’t think of website development as something that needs tight security on our local machines. Drupal websites tend to be public, have a small number of authenticated users, and, in the case of a data disclosure, sensitive data (like API and site keys) can be easily changed. However, think about all of the things you might have on a development computer. Email? Saved passwords that are obscured but not encrypted? Passwordless SSH keys? Login cookies? There are a ton of ways that a lost computer or disk drive can be used to compromise you and your clients.

If you’re a Mac user, FileVault 2 is enabled by default, so you’re likely already running with an encrypted hard drive. It’s easy to check and enable in System Preferences. Linux users usually have an encrypted disk option during install, as shown in the Ubuntu installer. Like both of these operating systems, Windows supports software-driven encryption with BitLocker.
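
If you just want to confirm what you already have before changing anything, each operating system ships a stock status check. The commands below are a quick sketch (drive letters and device names are placeholders, not taken from the article):

fdesetup status                # macOS: reports whether FileVault is on
lsblk -f                       # Linux: look for a crypto_LUKS / dm-crypt layer under the root disk
manage-bde -status C:          # Windows, from an elevated prompt: BitLocker status of the system drive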

I recently had to purchase a new SSD for my desktop computer, and I ended up with the Samsung 850 EVO. Most new Samsung drives support a new encryption technology called "eDrive".

But wait - don’t most SSDs already have encryption?

The answer is… complicated.

SSDs consist of individual cells, and each cell has a limited number of program/erase (PE) cycles. As cells reach their maximum number of PE cycles, they are replaced by spare cells. In a naive scenario, write activity could be concentrated on a small set of sectors on disk, wearing out those cells and using up the spare cells prematurely. Once all of the spare blocks are used, the drive is effectively dead (though you might be able to read data off of it). Drives can last longer if they spread writes across the entire disk automatically. You have data to save, which must be randomly distributed across the disk, and then read back together as needed. Another word for that? Encryption! As the poster on Stack Overflow says, it truly is a ridiculous and awesome hack to use encryption this way.

Most SSDs have an encryption key which secures the data but is in no way accessible to an end user. Some SSDs might let you access this through the ATA password, but there are concerns about that level of security. In general, if you have possession of the drive, you can read the data. The one feature you do get "for free" with this security model is secure erase. You don’t need to overwrite data on a drive anymore to erase it. Instead, simply tell the drive to regenerate its internal encryption key (via the ATA secure erase command), and BAM! the data is effectively gone.
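
On Linux, that ATA secure erase sequence can be issued with hdparm. This is a hedged sketch only: it destroys everything on the target drive, /dev/sdX and the temporary password are placeholders, and the drive must not be in the "frozen" state (the same state the install notes below warn about):

hdparm -I /dev/sdX                                        # confirm security is supported and the drive is "not frozen"
hdparm --user-master u --security-set-pass pass /dev/sdX  # set a temporary ATA password
hdparm --user-master u --security-erase pass /dev/sdX     # issue ATA secure erase (regenerates the internal key)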

All this means is that if you’re using any sort of software-driven encryption (like OS X’s FileVault, Windows BitLocker, or dm-crypt on Linux), you’re effectively encrypting data twice. It works, but it’s going to be slower than just using the AES chipset your drive is already using.

eDrive is a Microsoft standard based on TCG Opal and IEEE 1667 that gives operating systems access to manage the encryption key on an SSD. This gives you all of the speed benefits of disk-hosted encryption, with the security of software-driven encryption.

Using eDrive on a Windows desktop has a pretty strict set of requirements. Laptops are much more likely to support everything automatically. Unfortunately, this article isn’t going to end in success (which I’ll get to later), but it turns out that removing eDrive is much more complicated than you’d expect. Much of this is documented in parts on various forums, but I’m hoping to collect everything here into a single resource.

The Setup
  • An SSD supporting eDrive and "ready" for eDrive
  • Windows 10, or Windows 8 Professional
  • A UEFI 2.3.1 or higher motherboard, without any CSMs (Compatibility Support Modules) enabled, supporting EFI_STORAGE_SECURITY_COMMAND_PROTOCOL
  • A UEFI installation of Windows
  • (optionally) a TPM to store encryption keys
  • No additional disk drivers like Intel’s Rapid Storage Tools for software RAID support
  • An additional USB key to run secure erases, or an alternate boot disk
  • If you need to disable eDrive entirely, an alternate Windows boot disk or computer

I’m running Windows 10 Professional. While Windows 10 Home supports BitLocker, it forces encryption keys to be stored with your Microsoft account in the cloud. Honestly for most individuals I think that’s better than no encryption, but I’d rather have solid backup strategies than give others access to my encryption keys.

Determining motherboard compatibility can be very difficult. I have a Gigabyte GA-Z68A-D3-B3, which was upgraded to support UEFI with a firmware update. However, there was no way for me to determine what version of UEFI it used, or a way to determine if EFI_STORAGE_SECURITY_COMMAND_PROTOCOL was supported. The best I can suggest at this point is to try it with a bare Windows installation, and if BitLocker doesn’t detect eDrive support revert back to a standard configuration.

The Install

Samsung disk drives do not ship with eDrive enabled out of the box. That means you need to connect the drive and install Samsung’s Magician software to turn it on before you install Windows to the drive. You can do this from another Windows install, or install bare Windows on the drive knowing it will be erased. Install the Magician software, and set eDrive to "Ready to enable" under “Data Security”.

After eDrive is enabled, you must run a secure erase on the disk. Magician can create a USB or CD drive to boot with, or you can use any other computer. If you get warnings about the drive being "frozen", don’t ignore them! It’s OK to briefly pull the power on the running drive to clear the frozen state. If you skip the secure erase step, eDrive will not be enabled properly.

Once the disk has been erased, remove the USB key and reboot with your Windows install disk. You must remove the secure erase USB key, or Windows’ boot loader will fail (#facepalm). Make sure that you boot with UEFI and not BIOS if your system supports both booting methods. Install Windows like normal. When you get to the drive step, it shouldn’t show any partitions. If it does, you know secure erase didn’t work.

After Windows is installed, install Magician again, and look at the security settings. It should show eDrive as "Enabled". If not, something went wrong and you should secure erase and reinstall again. However, it’s important to note that “Enabled” here does not mean secure. Anyone with physical access to your drive can still read data on it unless you turn on BitLocker in the next step.

Turning on BitLocker

Open up the BitLocker control panel. If you get an error about TPM not being available, you can enable encryption without a TPM by following this How-To Geek article. As an aside, I wonder if there are any motherboards without a TPM that have the proper UEFI support for hardware BitLocker. If not, the presence of a TPM (and SecureBoot) might be an easy way to check compatibility without multiple Windows installs.

Work your way through the BitLocker wizard. The make or break moment is after storing your recovery key. If you’re shown the following screen, you know that your computer isn’t able to support eDrive.

You can still go ahead with software encryption, but you will lose access to certain ATA features like secure erase unless you disable eDrive. If you don’t see this screen, go ahead and turn on BitLocker. It will be enabled instantly, since all it has to do is encrypt the eDrive key with your passphrase or USB key instead of rewriting all data on disk.

Turning off eDrive

Did you see that warning earlier about being unable to turn off eDrive? Samsung in particular hasn’t publicly released a tool to disable eDrive. To disable eDrive, you need physical access to the drive so you can use the PSID printed on the label. You are supposed to use a manufacturer-supplied tool and enter this number, and it will disable eDrive and erase any data. I can’t see any reason to limit access to these tools, given you need physical access to the disk. There’s also a Free Software implementation of these standards, so it’s not like the API is hidden. The Samsung PSID Revert tool is out there thanks to a Lenovo customer support leak (hah!), but I can’t link to it here. Samsung won’t provide the tool directly, and requires drives to be RMA’ed instead.

For this, I’m going to use open-source Self Encrypting Drive tools. I had to manually download the 2010 and 2015 VC++ redistributables for it to work. You can actually run it from within a running system, which leads to hilarious Windows-crashing results.

C:\Users\andre\msed> msed --scan
C:\Users\andre\msed> msed --yesIreallywanttoERASEALLmydatausingthePSID <YOURPSID> \\.\PhysicalDrive?

At this stage, your drive is in the "Ready" state and still has eDrive enabled. If you install Windows now, eDrive will be re-enabled automatically. Instead, use another Windows installation with Magician to disable eDrive. You can now install Windows as if you’ve never used eDrive in the first place.

Quick Benchmarks

After all this, I decided to run with software encryption anyway, just like I do on my MacBook with FileVault. On an i5-2500K, 8GB of RAM, with the aforementioned Samsung 850 EVO:

[Benchmark screenshots: before turning on BitLocker, after turning on BitLocker, and after enabling RAPID in Magician]

RAPID is a Samsung provided disk filter that aggressively caches disk accesses to RAM, at the cost of increased risk of data loss during a crash or power failure.

As you can see, enabling RAPID (6+ GB a second!) more than makes up for the slight IO performance hit with BitLocker. There’s a possible CPU performance impact using BitLocker as well, but in practice with Intel’s AES crypto extensions I haven’t seen much of an impact on CPU use.

A common question about BitLocker performance is whether there is any sort of impact on the TRIM command used to maintain SSD performance. Since BitLocker runs at the operating system level, as long as you are using NTFS, TRIM commands are properly passed through to the drive.
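
If you want to check this on your own machine, Windows exposes the setting through fsutil. This is just an illustrative check, not part of the original benchmarks:

C:\> fsutil behavior query DisableDeleteNotify

A result of 0 means delete notifications (TRIM) are being sent to the drive; 1 means they are disabled.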

In Closing

I think it’s fair to say that if you want robust and fast SSD encryption on Windows, it’s easiest to buy a system pre-built with support for it. In a build-your-own scenario, you still need at least two Windows installations to configure eDrive. Luckily Windows 10 installs are pretty quick (10-20 minutes on my machine), but it’s still more effort than it should be. It’s a shame MacBooks don’t have support for any of this yet. Linux support is functional for basic use, with a new release coming out as I write. Otherwise, falling back to software encryption like regular BitLocker or FileVault 2 is certainly the best solution today.

Header photo is a Ford Anglia Race Car, photographed by Kieran White

Categories: Elsewhere

Reproducible builds folks: Reproducible builds: week 58 in Stretch cycle

Planet Debian - Wed, 08/06/2016 - 16:08

What happened in the Reproducible Builds effort between May 29th and June 4th 2016:

Media coverage

Ed Maste will present Reproducible Builds in FreeBSD at BSDCan 2016 in Ottawa, Canada on June 11th.

GSoC and Outreachy updates

Toolchain fixes
  • Paul Gevers uploaded fpc/3.0.0+dfsg-5 with a new helper script fp-fix-timestamps, which helps with reproducibility issues of PPU files in freepascal packages.
  • Sascha Steinbiss uploaded a patched version of epydoc to our experimental repository to test a patch for the use_epydoc issue.
Other upstream fixes

Packages fixed

The following 53 packages have become reproducible due to changes in their build-dependencies: angband blktrace code-saturne coinor-symphony device-tree-compiler mpich rtslib ruby-bcrypt ruby-bson-ext ruby-byebug ruby-cairo ruby-charlock-holmes ruby-curb ruby-dataobjects-sqlite3 ruby-escape-utils ruby-ferret ruby-ffi ruby-fusefs ruby-github-markdown ruby-god ruby-gsl ruby-hdfeos5 ruby-hiredis ruby-hitimes ruby-hpricot ruby-kgio ruby-lapack ruby-ldap ruby-libvirt ruby-libxml ruby-msgpack ruby-ncurses ruby-nfc ruby-nio4r ruby-nokogiri ruby-odbc ruby-oj ruby-ox ruby-raindrops ruby-rdiscount ruby-redcarpet ruby-redcloth ruby-rinku ruby-rjb ruby-rmagick ruby-rugged ruby-sdl ruby-serialport ruby-sqlite3 ruby-unicode ruby-yajl ruby-zoom thin

The following packages have become reproducible after being fixed:

Some uploads have addressed some reproducibility issues, but not all of them:

Uploads with an unknown result because they fail to build:

  • h2database/1.4.192-1 by Emmanuel Bourg, which forces a specific locale to generate documentation.

Patches submitted that have not made their way to the archive yet:

  • #825764 against docbook-ebnf by Chris Lamb: sort list of globbed files.
  • #825857 against python-setuptools by Anton Gladky: sort list of files in native_libs.txt.
  • #825968 against epydoc by Sascha Steinbiss: traverse lists in sorted order.
  • #826051 against dh-lua by Reiner Herrmann: sort list of Lua versions embedded into control file.
  • #826093 against osc by Alexis Bienvenüe: use SOURCE_DATE_EPOCH for manpage date.
  • #826158 against texinfo by Alexis Bienvenüe: use SOURCE_DATE_EPOCH for dates in makeinfo output.
  • #826162 against slime by Alexis Bienvenüe: sort list of contributors locale-independently.
  • #826209 against fastqtl by Chris Lamb: normalize permissions and order in tarball.
  • #826309 against gnupg2 by intrigeri: don't embed hostname and timestamp into gpgv.exe.
Package reviews

45 reviews have been added, 25 have been updated and 25 have been removed this week.

12 FTBFS bugs have been reported by Chris Lamb and Niko Tyni.

diffoscope development
  • diffoscope 53 was released by Mattia Rizzolo, with:
    • various improvements on temporary file handling;
    • fix a crash when comparing directories with broken symlinks (#818856);
    • great improvement on the deb(5) support (#818414), by Reiner Herrmann;
    • add FreeBSD packages in --list-tools, by Ed Maste.
  • diffoscope 54 was released shortly after to address a regression involving --list-tools, where a syntax error prevented proper listing of all tools.
strip-nondeterminism development

Mattia uploaded strip-nondeterminism 0.018-1 which improved support for *.epub files.

tests.reproducible-builds.org

Misc.

Last week we also learned about progress of reproducible builds in FreeBSD. Ed Maste announced a change to record the build timestamp during ports building, which is required for later reproduction.

This week's edition was written by Reiner Herrmann, Holger Levsen and Chris Lamb and reviewed by a bunch of Reproducible builds folks on IRC.

Categories: Elsewhere

Acquia Developer Center Blog: Drupal 8 Module of the Week: Workbench Moderation

Planet Drupal - Wed, 08/06/2016 - 15:29

Each day, between migrations and new projects, more and more features are becoming available for Drupal 8, the Drupal community’s latest major release. In this series, the Acquia Developer Center is profiling some prominent, useful, and interesting projects--modules, themes, distros, and more--available for Drupal 8. This week: Workbench Moderation.

Tags: acquia drupal planet, workbench moderation, state change, workflow
Categories: Elsewhere

Jonathan Dowland: Some tools for working with Docker images

Planet Debian - Wed, 08/06/2016 - 14:45

For developing complex, real-world Docker images, there are a number of tools that can make life easier.

The first thing to realise is that the Dockerfile format is severely limited. At work, we have eventually outgrown it and it has been replaced with a structured YAML document that is processed into a Dockerfile by a tool called dogen. There are several advantages to this, but I'll point out two: firstly, having data about the image available in a structured format makes automatically deriving technical documentation very easy. Secondly, some of the quirks of Dockerfiles, such as the ADD command respecting the environment's umask, are worked around in the dogen tool.

We have a large suite of integration tests that we run against images to make sure that we haven't introduced regressions during their development. The core of this is the Container Testing Framework, which makes use of the Behave system.

Each command that is run in a Dockerfile generates a new docker image layer. In practice, this can mean a real-world image has a great number of layers underneath it. Docker-dot-com have resisted introducing layer squashing into their tools, but with both hard limits for layers in some of the storage backends, and performance issues for most of the rest, this is a very real issue. Marek Goldmann wrote a squashing tool that we use to control the number of intermediate layers that are introduced by our images.
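
You can see the layer count for yourself with the stock Docker CLI, which prints one row per layer in an image's history; the image name below is just an example:

# One row per layer; images built from long Dockerfiles produce correspondingly long output
docker history --no-trunc example/myimage:latest

Short of a dedicated squashing tool, the usual manual mitigation is chaining related steps into a single RUN instruction so they produce one layer instead of several.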

Finally, even with tools like dogen and ctf, we would like to be able to have more sophisticated tools than shell scripts for configuring images, both at image build time and container run time. We want to do this without introducing extra dependencies inside the images which will not otherwise be used for their operation.

Ansible could be a solution for this, but there are practical issues with relying on it for runtime configuration in our situation. For that reason David Becvarik is designing and implementing Container Configuration Tool, or cct, a framework for performing configuration of containers written in Python.

Categories: Elsewhere

Tanguy Ortolo: Process command line arguments in shell

Planet Debian - Wed, 08/06/2016 - 13:29

When writing a wrapper script, one often has to process the command line arguments to transform them according to one's needs, to change some arguments, to remove or insert some, or perhaps to reorder them.

Naive approach

The naive approach to do that is¹:

# Process arguments, building a new argument list
new_args=""
for arg in "$@"
do
    case "$arg" in
        --foobar)
            # Convert --foobar to the new syntax --foo=bar
            new_args="$new_args --foo=bar"
            ;;
        *)
            # Take other options as they are
            new_args="$new_args $arg"
            ;;
    esac
done

# Call the actual program
exec program $new_args

This naive approach is simple, but fragile, as it will break on arguments that contain a space. For instance, calling wrapper --foobar "some file" (where some file is a single argument) will result in the call program --foo=bar some file (where some and file are two distinct arguments).

Correct approach

To handle spaces in arguments, we need either:

  • to quote them in the new argument list, but that requires escaping possible quotes they contain, which would be error-prone, and implies using external programs such as sed;
  • to use an actual list or array, which is a feature of advanced shells such as Bash or Zsh, not standard shell…

… except standard shell does support arrays, or rather, it does support one specific array: the positional parameter list "$@"². This leads to one solution to process arguments in a reliable way, which consists in rebuilding the positional parameter list with the built-in command set --:

# Process arguments, building a new argument list in "$@"
# "$@" will need to be cleared, not right now but on first iteration only
first_iter=1
for arg in "$@"
do
    if [ "$first_iter" -eq 1 ]
    then
        # Clear the argument list
        set --
        first_iter=0
    fi
    case "$arg" in
        --foobar)
            set -- "$@" --foo=bar
            ;;
        *)
            set -- "$@" "$arg"
            ;;
    esac
done

# Call the actual program
exec program "$@"

Notes
  1. If you prefer, for arg in "$@" can be simplified to just for arg.
  2. As a reminder, and contrary to what it looks like, quoted "$@" does not expand to a single field, but to one field per positional parameter.
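
To make note 2 concrete, here is a tiny demonstration script (not from the original post). Called as ./demo --foobar "some file", the quoted "$@" loop sees two fields while "$*" collapses everything into one:

#!/bin/sh
# Show that quoted "$@" preserves argument boundaries while "$*" does not.
printf 'Fields from "$@":\n'
for arg in "$@"
do
    printf '  [%s]\n' "$arg"
done
printf 'Fields from "$*":\n'
for arg in "$*"
do
    printf '  [%s]\n' "$arg"
done
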
Categories: Elsewhere

jfhovinne pushed to feature/NEXTEUROPA-11012 at ec-europa/platform-dev

Devel - Wed, 08/06/2016 - 11:59
  • 956aff4 Revert "NEXTEUROPA-11012: Initial work on Integration Layer Behat tes…
Categories: Networks

Cheppers blog: Exploring Behat Ep. 2: Behat Scenario Selectors

Planet Drupal - Wed, 08/06/2016 - 10:57

In our previous post on the topic, we used output formatters to determine how Behat displays test results. Now we continue with exploring our possibilities on what tests to run together with Behat’s scenario selectors.

Categories: Elsewhere

Lucas Nussbaum: Re: Sysadmin Skills and University Degrees

Planet Debian - Wed, 08/06/2016 - 10:04

Russell Coker wrote about Sysadmin Skills and University Degrees. I couldn’t agree more that a major deficiency in Computer Science degrees is the lack of sysadmin training. It seems like most sysadmins learned most of what they know from experience. It’s very hard to recruit young engineers (freshly out of university) for sysadmin jobs, and the job interviews are often a bit depressing. Sysadmin jobs are also not very popular with this audience, probably because university curriculums fail to emphasize what’s exciting about those jobs.

However, I think I disagree rather deeply with Russell’s detailed analysis.

First, Version Control. Well, I think that it’s pretty well covered in university curriculums nowadays. From my point of view, teaching CS at Université de Lorraine (France), mostly in Licence Professionnelle Administration de Systèmes, Réseaux et Applications à base de Logiciels Libres (warning: french), a BSc degree focusing on Linux systems administration, it’s not unusual to see student projects with a mandatory use of Git. And it doesn’t seem to be a major problem for students (which always surprises me). However, I wouldn’t rate Version Control as the most important thing that is required for a sysadmin. Similarly, Dependencies and Backups are things that should be covered, but probably not as first-class citizens.

I think that there are several pillars in the typical sysadmin knowledge.

First and foremost, sysadmins need a good understanding of the inner workings of an operating system. I sometimes feel that many Operating Systems Design courses are a bit too focused on the “Design” side of things. Yes, it’s useful to understand the low-level mechanisms, and be able to (mentally) recreate an OS from scratch. But it’s also interesting to know how real systems are actually built, and what trade-offs are involved. I very much enjoyed reading Brendan Gregg’s Systems Performance: Enterprise and the Cloud because each chapter starts with a great overview of how things are in the real world, with a very good level of detail. Also, addressing OS design from the point of view of performance could be a way to turn those courses into something more attractive for students: many people like to measure, benchmark, and optimize things, and it’s quite easy to demonstrate how different designs, or different configurations, make a big difference in terms of performance in the context of OS design. It’s possible to be a sysadmin and ignore, say, the existence of the VFS, but there’s a large class of problems that you will never be able to solve. It can be a good trade-off for a curriculum (e.g. at the BSc level) to decide to ignore most of the low-level stuff, but it’s important to be aware of it.

Students also need to learn how to design a proper infrastructure (that meets requirements in terms of scalability, availability, security, and maybe elasticity). Yes, backups are important. But monitoring is, too. As well as high availability. In order to scale, it’s important to be able to automate stuff. Russell writes that Sysadmins need some programming skills, but that’s mostly scripting and basic debugging. Well, when you design an infrastructure, or when you use configuration management tools such as Puppet, in some sense, you are programming, and in terms of the need to abstract things, it’s actually similar to doing object-oriented programming, with similar choices (should I use that off-the-shelf Puppet module, or re-develop my own? How should everything fit together?). Also, when debugging, it’s often useful to be able to dig into code, understand what the developer was trying to do, and check whether the expected behavior actually matches what you are seeing. It often results in spending a lot of time to create a one-line fix, and it requires very advanced programming skills. Again, it’s possible to be a sysadmin with only limited software development knowledge, but there’s a large class of things that you are unlikely to address properly.

I think that what makes sysadmin jobs both very interesting and very challenging is that they require a very wide range of knowledge. There’s often the opportunity to learn about new stuff (much more than in software development jobs). Of course, the difficult question is where to draw the line. What is the sysadmin knowledge that every CS graduate should have, even in curriculums not targeting sysadmin jobs? What is the sysadmin knowledge for a sysadmin BSc degree? For a sysadmin MSc degree?

Categories: Elsewhere

Russell Coker: Sysadmin Skills and University Degrees

Planet Debian - Wed, 08/06/2016 - 08:10

I think that a major deficiency in Computer Science degrees is the lack of sysadmin training.

Version Control

The first thing that needs to be added is the basics of version control. CVS (which is now regarded as obsolete) was initially released when I was in the first year of university. But SCCS and RCS had been in use for some time. I think that the people who designed my course were remiss in not adding any mention of version control (not even strategies for saving old versions of your work); one could say that they taught us about version control by letting us accidentally delete our assignments. :-#

If a course is aimed at just teaching programmers (as most CS degrees are) then version control for group assignments should be a standard part of the course. Having some marks allocated for the quality of comments in the commit log would also be good.

A modern CS degree should cover distributed version control, that means covering Git as it’s the most popular distributed version control system nowadays.

For people who want to work as sysadmins (as opposed to developers who run their own PCs) a course should have an optional subject for version control of an entire system. That includes tools like etckeeper for version control of system configuration and tools like Puppet for automated configuration and system maintenance.

Dependencies

It’s quite reasonable for a CS degree to provide simplified problems for the students to solve so they can concentrate on one task. But in the real world the problems are more complex. One of the more difficult parts of managing real systems is dependencies. You have issues with header files etc. at compile time and library versions at deployment time. Often you need a program to run on systems with different versions of the OS, which means making it compile on both and dealing with differences in behaviour.

There are lots of hacky things that people do to deal with dependencies in systems. People link compiled programs statically, install custom versions of interpreters in user home directories or /usr/local for daemons, and do many other things. These things can have bad consequences including data loss, system downtime, and security problems. It’s not always wrong to do such things, but it’s something that should only be done with knowledge of the potential consequences and a plan for mitigating them. A CS degree should teach the potential advantages and disadvantages of these options to allow graduates to make informed decisions.

Backups

I’ve met many people who call themselves computer professionals and think that backups aren’t needed. I’ve seen production systems that were designed in a way that backups were impossible. The lack of backups is a serious problem for the entire industry.

Some lectures about backups could be part of a version control subject in a general CS degree. For a degree that majors in Sysadmin at least one subject about backups is appropriate.

For any backup (even backing up your home PC) you should have offsite backups to deal with fire damage, multiple backups of different ages (especially important now that encryption malware is a serious threat), and a plan for how fast you can restore things.

The most common use of backups is to deal with the case of deleting the wrong file. Unfortunately this case seems to be the most rarely mentioned.

Another common situation that should be covered is a configuration error that results in a system that won’t boot correctly. It’s a very common problem and one that can be solved quickly if you are prepared but which can take a long time if you aren’t.

For a Sysadmin course it is important to cover backups of systems in remote datacenters.

Hardware

A good CS degree should cover the process of selecting suitable hardware. Programmers often get to advise on the hardware used to run their code, especially at smaller companies. Reliability features such as RAID, ECC RAM, and clustering should be covered.

Planning for upgrades is a very important part of this which is usually not taught. Not only do you need to plan for an upgrade without much downtime or cost, but you also need to plan for what upgrades are possible. Will your system require hardware next year that is more powerful than anything you can buy now? If so you need to plan for a cluster now.

For a Sysadmin course some training about selecting cloud providers and remote datacenter hosting should be provided. There are many complex issues that determine whether it’s most appropriate to use a cloud service, hosted virtual machines, hosted physical servers managed by the ISP, hosted physical servers purchased by the client, or on-site servers. Often a large system will involve 2 or more of those options, even some small companies use 3 or more of those options to try and provide the performance and reliability they need at a price they can afford.

We Need Sysadmin Degrees

Covering the basic coding skills takes a lot of time. I don’t think we can reasonably expect a CS degree to cover all that and also give good coverage to sysadmin work. While some basic sysadmin skills are needed by every programmer I think we need to have separate majors for people who want a career in system administration.

Sysadmins need some programming skills, but that’s mostly scripting and basic debugging. Someone whose main job is as a sysadmin can probably expect to never make any significant change to a program that’s more than 10,000 lines long. A large amount of the programming in a CS degree can be replaced by “file a bug report” for a sysadmin degree.

This doesn’t mean that sysadmins shouldn’t be doing software development or that they aren’t good at it. One noteworthy fact is that it appears that the most common job among developers of the Debian distribution of Linux is System Administration. Developing an OS involves some of the most intensive and demanding programming. But I think that more than a few people who do such work would have skipped a couple of programming subjects in favour of sysadmin subjects if they were given a choice.

Suggestions

Did I miss anything? What other sysadmin skills should be taught in a CS degree?

Do any universities teach these things now? If so, please name them in the comments; it is good to help people find universities that teach what they want to learn and to help them in their careers.

Related posts:

  1. university degrees Recently someone asked me for advice on what they can...
  2. A Better University I previously wrote about the financial value of a university...
  3. The Financial Value of a University Degree I’ve read quite a few articles about the value of...
Categories: Elsewhere

Francois Marier: Simple remote mail queue monitoring

Planet Debian - Wed, 08/06/2016 - 07:30

In order to monitor some of the machines I maintain, I rely on a simple email setup using logcheck. Unfortunately that system completely breaks down if mail delivery stops.

This is the simple setup I've come up with to ensure that mail doesn't pile up on the remote machine.

Server setup

The first thing I did on the server-side is to follow Sean Whitton's advice and configure postfix so that it keeps undelivered emails for 10 days (instead of 5 days, the default):

postconf -e maximal_queue_lifetime=10d
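
As a quick sanity check (a small extra step, not from the original post), running postconf with just the parameter name prints the value currently in effect:

postconf maximal_queue_lifetime
# expected output: maximal_queue_lifetime = 10d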

Then I created a new user:

adduser mailq-check

with a password straight out of pwgen -s 32.

I gave ssh permission to that user:

adduser mailq-check sshuser

and then authorized my new ssh key (see next section):

sudo -u mailq-check -i
mkdir ~/.ssh/
cat - > ~/.ssh/authorized_keys

Laptop setup

On my laptop, the machine from where I monitor the server's mail queue, I first created a new password-less ssh key:

ssh-keygen -t ed25519 -f .ssh/egilsstadir-mailq-check
cat ~/.ssh/egilsstadir-mailq-check.pub

which I then installed on the server.

Then I added this cronjob in /etc/cron.d/egilsstadir-mailq-check:

0 2 * * * francois /usr/bin/ssh -i /home/francois/.ssh/egilsstadir-mailq-check mailq-check@egilsstadir mailq | grep -v "Mail queue is empty"

and that's it. I get a (locally delivered) email whenever the mail queue on the server is non-empty.

There is a race condition built into this setup since it's possible that the server will want to send an email at 2am. However, all that does is send a spurious warning email in that case and so it's a pretty small price to pay for a dirt simple setup that's unlikely to break.

Categories: Elsewhere

Virtuoso Performance: wordpress_migrate now has a Drupal 8 release

Planet Drupal - Wed, 08/06/2016 - 06:04
wordpress_migrate now has a Drupal 8 release

So, last week XML/JSON source plugins came up as the next arena for me to work on (see my companion piece for more on that). My intention had been to hold off tackling wordpress_migrate until "nailing down" the XML parser plugin it depends on, but I decided that at least trying to prototype a WordPress migration would be a good test of the XML plugin. The initial attempt was promising enough that I kept going... and going - we've now got a (very) basic D8 dev release!

The UI

The user interface works generally as it did under Drupal 7 - upload the XML file on the first page of a wizard, configure taxonomies, then content, then review. It's important to note that this UI is for configuring the WordPress migration process - it creates the migrations, which you can then run using migrate_tools.

To begin, visit the migration dashboard at /admin/structure/migrate:

Clicking "Add import from WordPress" starts the wizard, where you can upload your XML file:

Clicking Next provides options for handling authors:

Next you can select which Drupal vocabularies (if any) to use for your WordPress tags and categories:

Then, choose the Drupal content types to use for WordPress posts and pages:

You may omit either one (actually, you could omit both if all you wanted to import were authors, tags and/or vocabularies!). Following that, for each content type you selected you can choose the Drupal text format to use for the body content:

In the review step, you can choose the machine name of the migration group containing your WordPress migrations, and also a prefix added to each generated migration's original machine name (if you were to import multiple WordPress blogs into your site, choosing distinct values here will keep them straight).

When you click Finish, you are brought to the migration dashboard for your new group:

Soon, you should be able to run your migration from this dashboard - for now, you'll need to use drush (which, really, you should use anyway for running migrations).

The drush command

Speaking of drush, you can configure your migrations through drush instead of stepping through the UI. Most of these options should be self-evident - do drush help wordpress-migrate-configure for more information.

drush wordpress-migrate-generate private://wordpress/nimportable.wordpress.2016-06-02.xml --group-id=test --prefix=my_ --tag-vocabulary=tags --category-vocabulary=wordpress_categories --page-type=page --page-text-format=restricted_html --post-type=article --post-text-format=full_html

Using drush for configuration has some advantages:

  1. If you're testing the import process, particularly tweaking the settings, it's much quicker to reissue the command line (possibly with minor edits) than to step through the UI.
  2. Scriptability!
  3. If your WordPress site is large, uploading the XML through the UI may run into file upload limits or timeout issues - alternatively, you can copy the XML file directly to your server and configure the migration to point to where you put it.
Ctools Wizard

This was my first time using the Ctools wizard API, and it's really easy to create step-by-step UIs - even dynamic ones (where not all the steps are determined up-front). Basically:

  1. Set up two routes in example.routing.yml, one for the landing page of your wizard, and one to reflect the specific steps (containing a {step} token).
  2. Create a class extending FormWizardBase.
  3. Implement getRouteName(), returning the step route from above.
  4. The key - implement getOperations() to tell the wizard what your steps are (and their form classes):

  public function getOperations($cached_values) {
    $steps = [
      'source_select' => [
        'form' => 'Drupal\wordpress_migrate_ui\Form\SourceSelectForm',
        'title' => $this->t('Data source'),
      ],
      'authors' => [
        'form' => 'Drupal\wordpress_migrate_ui\Form\AuthorForm',
        'title' => $this->t('Authors'),
      ],
      'vocabulary_select' => [
        'form' => 'Drupal\wordpress_migrate_ui\Form\VocabularySelectForm',
        'title' => $this->t('Vocabularies'),
      ],
      'content_select' => [
        'form' => 'Drupal\wordpress_migrate_ui\Form\ContentSelectForm',
        'title' => $this->t('Content'),
      ],
    ];
    // Dynamically add the content migration(s) that have been configured by
    // ContentSelectForm.
    if (!empty($cached_values['post']['type'])) {
      $steps += [
        'blog_post' => [
          'form' => 'Drupal\wordpress_migrate_ui\Form\ContentTypeForm',
          'title' => $this->t('Posts'),
          'values' => ['wordpress_content_type' => 'post'],
        ],
      ];
    }
    if (!empty($cached_values['page']['type'])) {
      $steps += [
        'page' => [
          'form' => 'Drupal\wordpress_migrate_ui\Form\ContentTypeForm',
          'title' => $this->t('Pages'),
          'values' => ['wordpress_content_type' => 'page'],
        ],
      ];
    }
    $steps += [
      'review' => [
        'form' => 'Drupal\wordpress_migrate_ui\Form\ReviewForm',
        'title' => $this->t('Review'),
        'values' => ['wordpress_content_type' => ''],
      ],
    ];
    return $steps;
  }

Particularly note how the content-type-specific steps are added based on configuration set in the content_select step, and how they use the same form class with an argument passed to reflect the different content types they're handling.

Your form classes should look pretty much like any other form classes, with one exception - you need to put the user's choices where the wizard can find them. For example, in the VocabularySelectForm class:

  public function submitForm(array &$form, FormStateInterface $form_state) {
    $cached_values = $form_state->getTemporaryValue('wizard');
    $cached_values['tag_vocabulary'] = $form_state->getValue('tag_vocabulary');
    $cached_values['category_vocabulary'] = $form_state->getValue('category_vocabulary');
    $form_state->setTemporaryValue('wizard', $cached_values);
  }

Next steps

Now, don't get too excited - wordpress_migrate is very basic at the moment, and doesn't yet support importing files or comments. I had a couple of people asking how they could help move this forward, which was difficult when there was nothing there yet - now that we have the foundation in place, it'll be much easier for people to pick off one little (or big;) bit to work on. Having spent more time than I intended on this last week, I need to catch up in other areas so won't be putting much more time into wordpress_migrate immediately, but I'm hoping I can come back to it in a couple of weeks to find a few community patches to review and commit.

mikeryan Tue, 06/07/2016 - 23:04 Tags
Categories: Elsewhere

Virtuoso Performance: Drupal 8 plugins for XML and JSON migrations

Planet Drupal - Wed, 08/06/2016 - 06:03
Drupal 8 plugins for XML and JSON migrations

I put some work in last week on implementing wordpress_migrate for Drupal 8 (read more in the companion piece). So, this seems a good point to talk about the source plugin work that's based on, supporting XML and JSON sources in migrate_plus.

History and status of the XML and JSON plugins

Last year Mike Baynton produced a basic D8 version of wordpress_migrate, accompanied by an XML source plugin. Meanwhile, Karen Stevenson implemented a JSON source plugin for Drupal 8. The two source plugins had distinct APIs, and differing configuration settings, but when you think about it they really only differ in the parsing of the data - they are both file-oriented (may be read via HTTP or from a local filesystem) unlike SQL, both require a means to specify how to select an item ("row") from within the data, and a means to specify how to select fields from within an item. I felt there should be some way to share at least a common interface between the two, if not much of the implementation.

So, in migrate_plus I have implemented an Url source plugin (please weigh in with suggestions for a better name!) which (ideally) separates the retrieval of the data (using a fetcher plugin) from parsing of the data (using a parser plugin). There are currently XML and JSON parser plugins (based on Mike Baynton's and Karen Stevenson's original work), along with an HTTP fetcher plugin. All of the former migrate_source_xml functionality is in migrate_plus now, so that module should be considered deprecated. Not everything from migrate_source_json is yet in migrate_plus - for example, the ability to specify HTTP headers for authentication, which in the new architecture should be part of the HTTP fetcher and thus available for both XML and JSON sources. Since no new work is going into migrate_source_json at this point, the best way forward for JSON migration support is to contribute to beefing up the migrate_plus version of this support.

Using the Url source plugin with the XML parser plugin

The migrate_example_advanced submodule of migrate_plus contains simple examples of both XML and JSON migrations from web services. Here, though, we'll look at a more complex real-world example - migration from a WordPress XML export.

The outermost element of a WordPress export is <rss> - within that is a <channel> element, which contains all the exported content - authors, tags and categories, and content items (posts, pages, and attachments). Here's an example of how tags are represented:

<rss>
  <channel>
    ...
    <wp:tag>
      <wp:term_id>6859470</wp:term_id>
      <wp:tag_slug>a-new-tag</wp:tag_slug>
      <wp:tag_name><![CDATA[A New Tag]]></wp:tag_name>
    </wp:tag>
    <wp:tag>
      <wp:term_id>18</wp:term_id>
      <wp:tag_slug>music</wp:tag_slug>
      <wp:tag_name><![CDATA[Music]]></wp:tag_name>
    </wp:tag>
    ...
  </channel>
</rss>

The source plugin configuration to retrieve this data looks like the following (with comments added for annotation). The configuration for a JSON source would be nearly identical.

source:
  # Specifies the migrate_plus url source plugin.
  plugin: url
  # Specifies the http fetcher plugin. Note that the XML parser does not actually use this,
  # see below.
  data_fetcher_plugin: http
  # Specifies the xml parser plugin.
  data_parser_plugin: xml
  # One or more URLs from which to fetch the source data (only one for a WordPress export).
  # Note that in the actual wordpress_migrate module, this is not builtin to the wordpress_tags.yml
  # file, but rather saved to the migration_group containing the full set of WP migrations
  # from which it is merged into the source configuration.
  urls: private://wordpress/nimportable.wordpress.2016-06-03.xml
  # For XML, item_selector is the xpath used to select our source items (tags in this case).
  # For JSON, this would be an integer depth at which importable items are found.
  item_selector: /rss/channel/wp:tag
  # For each source field, we specify a selector (xpath relative to the item retrieved above),
  # the field name which will be used to access the field in the process configuration,
  # and a label to document the meaning of the field in front-ends. For JSON, the selector
  # will be simply the key for the value within the selected item.
  fields:
    -
      name: term_id
      label: WordPress term ID
      selector: wp:term_id
    -
      name: tag_slug
      label: Analogous to a machine name
      selector: wp:tag_slug
    -
      name: tag_name
      label: 'Human name of term'
      selector: wp:tag_name
  # Under ids, we specify which of the source fields retrieved above (tag_slug in this case)
  # represent our unique identifier for the item, and the schema type for that field. Note
  # that we use tag_slug here instead of term_id because posts reference terms using their
  # slugs.
  ids:
    tag_slug:
      type: string

Once you've fully specified the source in your .yml file (no PHP needed!), you simply map the retrieved source fields normally:

process:
  # In wordpress_migrate, the vid mapping is generated dynamically by the configuration process.
  vid:
    plugin: default_value
    default_value: tags
  # tag_name was populated via the source plugin configuration above from wp:tag_name.
  name: tag_name

Above we pointed out that the XML parser plugin does not actually use the fetcher plugin. In an ideal world, we would always separate fetching from parsing - however, in the real world, we're making use of existing APIs which do not support that separation. In this case, we are using PHP's XMLReader class in our parser - unlike other PHP XML APIs, this does not read and parse the entire XML source into memory, thus is essential for dealing with potentially very large XML files (I've seen WordPress exports upwards of 200MB). This class processes the source incrementally, and completely manages both fetching and parsing, so as consumers of that class we are unable to make that separation. There is an issue in the queue to add a separate XML parser that would use SimpleXML - this will be more flexible (providing the ability to use file-wide xpaths, rather than just item-specific ones), and also will permit separating the fetcher.  

Much more to do!

What we have in migrate_plus today is (almost) sufficient for WordPress imports, but there's still a ways to go. The way fetchers and parsers interact could use some thought; we need to move logically HTTP-specific stuff out of the general fetcher base class, etc. Your help would be much appreciated - particularly with JSON sources, since I don't have handy real-world test data for that case.

mikeryan Tue, 06/07/2016 - 23:03 Tags
Categories: Elsewhere

Gizra.com: Drupal 8: Migrate Nodes with Attachments Easily

Planet Drupal - Wed, 08/06/2016 - 06:00

Drupal-8-manina is at its highest. Modules are being ported, blog posts are being written, and new sites are being coded, so we in Gizra decided to join the party.

We started with a simple site that will replace an existing static site. But we needed to migrate node attachments, and we just couldn’t find an existing solution. Well, it was time to reach out to the community

Any example of #Drupal 8 migration of files/ images out there? (including copy from source into public:// )

— Amitai Burstein (@amitaibu) April 8, 2016

A few minutes after the tweet was published, we received a great hint from the fine folks at Evolving Web. They were already migrating files into Drupal 8 from Drupal 7, and were kind enough to write a blog post about it.

However, we were still missing another piece of the puzzle, as we also wanted to migrate files from an outside directory directly into Drupal. I gave my good friend @jsacksick a poke (it’s easy, as he sits right in front of me), and he gave me the answer on a silver platter.

The post has a happy ending - we were able to migrate files and attach them to a node!

Continue reading…

Categories: Elsewhere

DrupalEasy: DrupalEasy Podcast 178 - Rifftrax - Erik Peterson

Planet Drupal - Wed, 08/06/2016 - 04:40

Direct .mp3 file download.

Rifftrax is the movie commentary web product from former members of Mystery Science Theater 3000 (powered by Drupal). Erik Peterson (torgospizza) is the lead web architect, art “director”, and sometimes-writer (marketing and social media copy) for the site. Mike and Ryan interview Erik about the cultural phenomenon which is also a verb (to "MiSTy" a film is to watch it while riffing).

Interview

Erik is a contributor to Drupal Commerce and, before it, Ubercart. He is also a co-maintainer of Commerce Stripe and creator of Commerce Cart Link.

DrupalEasy News
Three Stories
Sponsors
Picks of the Week
Upcoming Events
Follow us on Twitter
Five Questions (answers only)
  1. Photography such as this Raven photo
  2. PHPStorm 10 (EAP)
  3. Giraffe @ SD Zoo
  4. Manage a team, teach for passion. Play music in my spare time. (What’s that?)
  5. Matt Glaman. He’s well-versed in pretty much everything, and wrote a D8 cookbook.
Intro Music

MST3K Theme Song, California Lady (Gravy)

Subscribe

Subscribe to our podcast on iTunes, Google Play or Miro. Listen to our podcast on Stitcher.

If you'd like to leave us a voicemail, call 321-396-2340. Please keep in mind that we might play your voicemail during one of our future podcasts. Feel free to call in with suggestions, rants, questions, or corrections. If you'd rather just send us an email, please use our contact page.

Categories: Elsewhere
