Dick Moore serves as Director of Technology at Ufi, where he looks after four teams that design, build and maintain learndirect’s IT infrastructure. The concept of a ‘University for Industry’ led to the creation of Ufi, which in turn serves as an umbrella organization supporting learndirect. Learndirect is the world’s largest publicly funded e-learning platform with in excess of 2.5 million learners.
Author - Dick Moore, "In-source, out-source, open-source, right-source". Originally submitted July 25th, 2007 to the OSS and OER in Education Series, Terra Incognita blog (Penn State World Campus), edited by Ken Udas.
“The concept of a ‘University for Industry’ led to the creation of Ufi in 1998. The organisation then set-up learndirect, a nationally recognised brand for learning. In six years learndirect has become the largest e-learning network of its kind in the world, and has individualised the delivery of learning to a mass audience through a unique combination of flexibility, accessibility and support.”
In this piece I plan to talk a bit about our e-learning platform and the part that open source tools and systems have played in our success.
VOLUMES
447,000 learners last year
4,000 concurrent learners at peak
consuming 70 Mb/s of bandwidth
99.98% systems availability
The learndirect learner management system (LMS), like most learning management systems, is more than a website with lots of content.
Content sites such as the BBC or CNN typically present their consumers with a collection of web pages. If they are personalised at all, they present a sub-set of content according to preferences or tracked activity. Critically, the content itself does not change from consumer to consumer, and as a result it can be load-balanced across a number of servers or caches and requires relatively little tracking.
Learner management systems such as the learndirect system track a learner’s progress through a piece of learning and adapt in response to on-programme formative assessment. Such systems do expect to modify content according to consumer behaviour and as a result the use of multiple content servers only works to an extent. Such systems require a single authoritative data source for each course.
Additionally, consumers visiting a news or similar site have plenty of choice. If the BBC site is slow or not there for whatever reason, there are plenty of other such sites for a consumer to visit.
With web-delivered learning, the consumer intends to engage in a learning activity that they have formally enrolled in, and in many cases they have travelled to one of our learning centres to take their course. There is no other site for them to go to: if the site is slow or closed, their journey was a waste of time.
For this reason the system must be both available and perform well. It is not enough that a system is available and returns content. If e-learning is to be effective, the medium needs to be as unobtrusive as possible; content has to render without the consumer becoming aware of any wait.
This presents us with a double bind: each user’s content is customised, and there is a service expectation of 100% availability and responsiveness. Add to this the issues of large scale and 24 x 7 availability, and we can see that constructing such a service is a serious web-engineering exercise.
If you are not monitoring the service, then you are just running software.
It’s never good when the first person to tell you that your service has a problem is one of your consumers. Without appropriate monitoring software this will inevitably be the case, and in all probability they won’t tell you immediately.
So, the first key differentiator between a service and a system is Monitoring.
When our service was first constructed, a very expensive piece of software was purchased to perform availability monitoring. However, Mr. Heisenberg was forgotten: the load associated with that particular tool was sufficient to impact the system detrimentally. The tool itself was sold as the usual universal panacea, but in implementation it was clear that its forte was component monitoring, not service monitoring.
Running a live system with this tool gave us all sorts of problems. It required agents on all machines, was really only designed around component availability, and even then availability was often measured from the wrong place (inside the firewall).
We took a look at the open source offerings available at that time and selected two.
Nagios has won lots of awards. We use it to monitor events from two locations:
Our DMZ, where it looks at all of our components every 90 seconds and, critically, has thresholds set for Green, Amber and Red. While most components in our large system are duplicated to provide resilience, it’s absolutely vital to know when one of your resilient components has failed in order to prevent a systems failure.
The public Internet. From this location, we can look at the service(s) from the perspective of the end user.
Nagios is used to provide event monitoring. Implementing such a tool is not to be undertaken lightly. Getting the sensitivity correct so as not to cry wolf, and embedding the culture such that when an alert is sent out, the operational staff respond rapidly is, in my opinion, more difficult than installing the system in the first place.
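For readers unfamiliar with how Nagios event monitoring works: a check is just a small program whose exit code Nagios maps to OK, WARNING or CRITICAL, corresponding to the Green/Amber/Red thresholds above. The sketch below is purely illustrative (the thresholds and the idea of checking response time are my own example, not learndirect’s actual configuration):

```python
#!/usr/bin/env python
"""Minimal Nagios-style check: page response time with Green/Amber/Red
thresholds. Illustrative only - the thresholds are invented, not
learndirect's real values. Nagios reads the exit code: 0 = OK (Green),
1 = WARNING (Amber), 2 = CRITICAL (Red)."""
import sys
import time
import urllib.request

OK, WARNING, CRITICAL = 0, 1, 2

def check_response_time(url, warn_secs=2.0, crit_secs=5.0):
    """Fetch the URL and map the elapsed time onto a Nagios status code."""
    start = time.time()
    try:
        urllib.request.urlopen(url, timeout=crit_secs * 2)
    except Exception as exc:
        # Unreachable counts as a hard failure - Red.
        return CRITICAL, "CRITICAL - %s unreachable: %s" % (url, exc)
    elapsed = time.time() - start
    if elapsed >= crit_secs:
        return CRITICAL, "CRITICAL - response took %.2fs" % elapsed
    if elapsed >= warn_secs:
        return WARNING, "WARNING - response took %.2fs" % elapsed
    return OK, "OK - response took %.2fs" % elapsed

if __name__ == "__main__" and len(sys.argv) > 1:
    status, message = check_response_time(sys.argv[1])
    print(message)    # first line of output becomes the Nagios status text
    sys.exit(status)  # the exit code drives the Green/Amber/Red state
```

Running the same check from both the DMZ and the public Internet, as we do, is simply a matter of scheduling it from two Nagios installations.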
The second open source monitoring tool we use provides trend monitoring. After looking around, we found Cacti.
While Nagios tells us when we have a specific issue/problem, Cacti provides us with the information to understand or diagnose the root cause. In measuring volumes and their trends, Cacti allows us to look across the whole application stack at any point in time and examine critical volumes.
Cacti is used to measure volumes. If a system can return a number, Cacti can capture, store and trend it. These volumes can be business or technical; examples include the number of users logged into the system over time, or critical system volumes such as bandwidth, disk space, CPU or memory usage.
When you want to compare historical volumes or activity at a particular moment in time, Cacti can provide it.
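The “return a number” contract is very loose: Cacti runs a data-input script at each polling interval and records whatever name:value pairs it prints. As a hedged sketch (the metric name and the session-counting logic are invented for illustration, not our actual schema), a business-volume script might look like:

```python
#!/usr/bin/env python
"""Hypothetical Cacti data-input script. Cacti runs this at each polling
interval and stores the printed name:value pairs as data points. The
metric name and counting logic are illustrative, not learndirect's."""

def count_logged_in_users(sessions):
    """Count sessions still marked active - stands in for a real query
    against the LMS session store."""
    return sum(1 for s in sessions if s.get("active"))

def format_cacti_output(values):
    """Cacti expects space-separated name:value pairs on a single line."""
    return " ".join("%s:%d" % (name, value) for name, value in values)

if __name__ == "__main__":
    # In a real script these figures would come from the live system.
    sessions = [{"active": True}, {"active": True}, {"active": False}]
    users = count_logged_in_users(sessions)
    print(format_cacti_output([("logged_in_users", users)]))
```

Once Cacti is storing the series, the historical comparison described above is just a matter of overlaying graphs for the two time windows.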
As you might expect from an open source tool set, both of these tools are highly extensible. We have been able to write and adapt agents to interface with them and, with the exception of our database monitoring, we have been able to monitor and trend all our services.
I spoke above about getting the culture right. Putting these critical volumes onto big flat screens, making them obvious to everyone in your operations and service team, was the single most important cultural change we made next to implementing an ITIL service culture.
The real question here is how we’ve been allowed to put all this instrumentation all over our application. Most government contracts are outsourced, but we chose to in-source our operations and development teams.
To understand this we need to talk about technology in a business context. Most organisations have either an implicit or explicit technology strategy. Our Technology Strategy provides a framework that allows the organisation to make ‘good’ strategic choices, e.g. hardware, software, monitoring systems and hosting providers. These choices are deployed within a governance framework to ensure that the business and service models that depend on technology can be delivered now and in the future.
At the risk of stating the obvious, the selection of technology and service model an organisation chooses can mean the difference between a successful business and one that fails. As a consequence, organisations and IT directors tend to be conservative in their decision-making.
At a simplistic level, technology is used for three things within an organisation:
to run the business
to change the business
to innovate
Unless you are a start-up, the bulk of investment and cost is already sunk in running your company. Changing the company’s information systems usually occurs incrementally and takes the form of modifying the status quo. We are left with the shiny innovation tip of the cost iceberg to introduce new ways of doing things.
If we accept some of the above, we can see that technology strategies have considerable inertia, and unless there are some strong external pressures (failure to meet Service levels, company financial pressure, loss of market share), the adoption of new technologies is going to be slow. There is still a lot of COBOL out there!
So if you don’t already have a lot of open source in use, introducing it requires overcoming quite a lot of inertia.
As a company we have mandated the use of specific open-source operating systems and applications within our technology strategy where we can see cost and risk reduction. It’s worth saying that if our service were totally outsourced then these would not be our choices to make, other than at contract negotiation, and it’s very bad form to tell a supplier both what you want and how to do it.
The last ten years have seen the trend to out-source IT services and development continue to increase. This should not be a surprise when we consider the risk and cost of getting it wrong. Out-source companies come with the allure of having solved all these problems before and of having a large pool of experienced staff, and many organisations have significantly reduced the cost and risk of running their IT systems as a result.
Central to a successful out-source arrangement is a well-defined contract, with a service description and an underpinning set of requirements. Good example candidates for out-sourcing are payroll or desktop management: in both cases, an organisation can describe what it wants, and the amount of change required going into the future can be estimated accurately.
It’s in the nature of out-sourced contracts that you describe to the supplier what you want but refrain from telling them how to do it.
If your IT application is the core of what your organisation does (such as the learndirect LMS) and you know you are going to undergo an annual cycle of change then in-sourcing your operations should be considered.
Having in-sourced the learndirect operations, we have seen a significant reduction in cost and have increased our service availability to more than 99.9%.
If you have in-sourced your application development or hosting then you have the opportunity to exploit open source tools and applications for competitive or service advantage (are they the same thing?)
Having in-sourced the operations and now the development of our core application, we have put open source technology at the core of our technology strategy.
While we retain Oracle as our database of choice, we have adopted a wide range of open source tools: Apache, Squid, JBoss, Hibernate, MySQL and Linux, to name but a few.
The advantages are obvious:
They are standards compliant, or effectively comprise a cross-platform standard in their own right.
They are robust and open to peer review, such that issues and problems are rapidly identified and resolved.
They are often designed and built by practitioners and as such have solutions for real-world problems built into them.
They increasingly come with support contracts.
Looking back on what I have written, it’s a bit rambling; however, the key points I want to make are:
Don’t confuse running a service with running an application. Monitoring and non-functional requirements such as usability, supportability, maintainability and availability make the difference.
Monitoring and its application is critical in running a service
Getting a technology strategy that supports the business and recognizes that once started it’s often expensive to change.
In-sourcing / out-sourcing / right-sourcing will impact what you have control of.
Open source tools can be used to run world class infrastructure.
I hope you found something to make you think in this piece. We live in amazing times. The richest person in the world 10 years ago did not have one tenth of the knowledge we now have at our fingertips. Lastly, in the words of my favourite bumper sticker of all time, if you think that education is expensive try ignorance.
It is incumbent on me to state that the views expressed in this piece are my own and do not necessarily represent those of my organisation.
7 Responses to “Running a Service Not a System”
Dick, First, thank you for this post. I am interested in teasing out connections between your use of OSS, technology strategy, and the impact on how Ufi / learndirect supports learners. Referring back to one of your summary points:
Getting a technology strategy that supports the business and recognizes that once started it’s often expensive to change.
I am wondering if your use of open source software has influenced your technology strategy formally, and if there is any explicit connection in the formal technology strategy and the organization’s strategy involving education services.
Cheers
Ken
Thanks Ken
The excerpt below is taken directly from our Technology Strategy summary:
“Service Delivery Platform: Use of open source components such as Apache Web Server and Squid Proxy Server have been core to our service from the beginning. Within the last two years, we have migrated from using Sun Solaris to Redhat Enterprise Linux as our delivery platform. This has provided us with a 95% reduction in cost and has proven to be highly reliable.”
We have similar statements for our Development and Database platforms that all state explicitly the use of Open Source technologies.
In the UK, our Office of Government Commerce has over the last few years suggested that ‘open source should be actively considered alongside proprietary alternatives’ http://www.ogc.gov.uk/news_2007_4280.asp
It’s my contention in the piece above that it’s not possible to mandate open source if you out-source your IT provision.
At Ufi, as a result of in-sourcing, we have been able to embed open source within our technology strategy for non-utility applications, and as a result we have seen significant cost reductions and improvements in service reliability.
Cheers, Dick
Dick, Hello. I have a few questions and I am not sure where to go first. I guess that I would first like to tease out a little more of your thinking around the connection between supporting individual users that expect a unique experience and using open source software. You started your post by illustrating the importance of system reliability and how you and your team ensure this by recasting the notion of running a “system” into running a “service,” which is supported by excellent monitoring. The need in your context is derived from the challenges of supporting an environment that a) requires customized experiences (there is no place else for your learner to go if the learndirect platform is down, unlike somebody seeking sports scores or news), b) requires high reliability, and c) supports high volume.
You then point to the advantages of in-sourcing these parts of your infrastructure using open source software. Here is my first assertion. It seems that the “customization” criterion in the above mix is most critical; after all, high volume and high reliability are pretty typical reasons to out-source. Is that correct? Through your experience, what advantages does OSS potentially provide that proprietary options do not? And, when you are evaluating OSS options, what are some of the evaluation criteria that you prioritize?
Thanks, Ken
Lots of issues there Ken
I would first like to tease out a little more of your thinking around the connection between supporting individual users that expect a unique experience and using open source software
Hmmm, while there is no direct connection, using open source software has allowed us to build highly resilient infrastructures that scale. We have a finance and Management Information system that provides performance data a week in arrears.
Using OSS and commodity hardware has meant that the system has had 100% uptime over the last two years, and we are able to scale horizontally, keeping pace with demand (downloads have never taken more than 10 seconds).
Here is my first assertion. It seems that the “customisation” criteria in the above mix is most critical, after all, high volume and high reliability are pretty typical reasons to out source.
It’s not the customised end-user experience that makes in-sourcing important; it’s the degree to which we wish to customise the end-user solution that is the driver. If your solution is subject to only minor change, then out-sourcing is a good option. Our e-learning platform is our business and we want to be able to make significant changes on a regular basis. Trying to contract-manage third parties to both provide 99.9% uptime AND process lots of change is not easy; contracts by their nature are all about defining risk in advance. In-sourcing has allowed us a finer-grained management of risk.
Through your experience, what advantages does OSS potentially provide that proprietary options do not?
In many respects, using OSS has a similar risk profile to in-sourcing: as a purchaser, you always take the risk. Using commercial software, you are buying into a solution with the intent of reducing risk. Of course this is often (though not always) a chimera; commercial software comes with service contracts and SLAs, though when one hits a significant incompatibility, it’s either very expensive or impossible to have it customised for your application. Don’t misunderstand me, we use commercial and OSS database software. For all critical data I use the commercial provider. Our open source database software provides fantastic value for those applications that require read-only access.
And, when you are evaluating OSS options, what are some of the evaluation criteria that you prioritise?
Good question, we would use much the same criteria that we would for commercial software.
How well established is it?
Is it a market leader?
What is the size of the user base?
Are there third parties providing support?
What do the technical people I respect think of this product?
How good a fit is it with our other Technical Roadmaps?
What is the total cost of ownership over 3 years likely to be?
Using the above criteria to evaluate Apache against other web servers, we decided to use Apache :-)
Hope that helps Dick
Dick,
Thanks much for this. I am sort of trying to make some connections between what we have learned through your posting and previous postings. Customization/localization is a major theme in both the open source software postings and the open educational resources postings in this Series, which I find quite interesting. You have introduced a different (or at least what feels like a different) aspect of customization. To this point it seems that most of the dialog about customization has been in recognition that different groups of learners (and faculty and administrators) will have different needs, so content and infrastructure should be localized to meet local cultural, linguistic, access, etc. circumstances and needs. It has been indicated generally that OSS and OER provide better opportunities for localization than proprietary software and educational content. Mara, in our last post, also pointed out that the level of customization OSS can provide brings challenges for usability testing and user experience.
I know that my next question might not be answerable, but in any event, would you be able to articulate some of the differences between “customization” as you are describing/treating it and “customization” as described above (as treated in previous postings)?
Cheers & Thanks
Ken
Dick,
Hello. Just another little follow-up question. You outlined some of the advantages of using OSS. What were the challenges that you encountered (technical, organizational, etc.)?
Cheers
Ken
Ken