Monday, 13 July 2015

Managing your API

So you have developed a brilliant new web-based API; the world will be fighting to get access to it; it's the most useful thing ever.  But how do you manage access to it?  How do you protect it?  How do you monitor calls to it?  How do you restrict access based on who the consumer is?

Microsoft offer a solution to this as part of their Azure offering.  This blog will look at what Microsoft Azure's API Management (APIM) product offers.

With APIM you can:

  • Ensure access to the API is only via APIM
  • Set throttling policies based on users/groups
  • Make users sign-up for access
  • Generate and publish developer documentation easily
  • Monitor traffic
And probably a whole lot more that I have not explored yet.

And all this can be done for multiple APIs, all through a single endpoint.  You simply need an Azure account and an APIM instance to get started.

I'll look at how to set up some of the things mentioned above, starting with establishing an APIM instance and pointing it at your already published API.

To create an APIM instance, go to the Azure management portal (I use the old-style portal at https://manage.windowsazure.com/) and select the APIM charm on the left.


You will be presented with a list of your instances (though at this point you probably don't have any), and in the bottom left is the New button to create your first instance.  Clicking this opens a banner with the options of what you want to create; there are no real options to choose, so just click Create in the third column.
You will be asked for a URL, which is the endpoint the world will use to gain access to your API, plus a few other questions about where the instance will be hosted, who you are and your contact details.  If you select the advanced settings you can choose a pricing tier, but that, as the name suggests, is an advanced topic and this is a basic intro blog, so leave the pricing tier as Developer for now.

After that it will go away and spin up an instance for you, which will take some time.  At the end of this the list of APIM instances will be populated with a grand total of 1!!!

Next select the name of your new APIM instance and you will be presented with the quick start page, the only page that I honestly use within the Azure portal in relation to the APIM instance.


The 'Publisher Portal' is the gateway to everything to do with the APIM instance.  And out of interest, the 'Developer Portal' is what the public will see if you tell them where to look for information on your API, i.e. where you can provide documentation.

To finish setting up your vanilla first APIM instance, go into the publisher portal, where you will be presented with a dashboard with not a lot of data.
The next thing to do is connect APIM to your existing API, which you do by clicking the Add API button and providing the details required.  You need a name for the API, the URL of the API you made earlier, a suffix which will be used to distinguish between the APIs associated with the APIM instance, and whether you want access to the API (via APIM) to be allowed with or without TLS.

There are still two steps to go.  Firstly, define the operations that can be accessed via the API, i.e. which verbs are allowed and for which URLs.  Add operations such as 'GET users' using the intuitive dialog.
Finally, you need to associate a product with the API.  Products are, in my opinion, badly named: they represent levels of subscription to the API.  By default two products are preconfigured, Starter and Unlimited; you can associate these or any other products with the API using the Products tab on the right of the screen.

After this your new APIM is ready to go.

The next thing you may wish to do is add a throttling policy to the API (or more specifically to the product).  You do this by selecting the Policies option in the menu on the left, picking the combination of options you want to set the policy for (product, API and operation), and clicking the add policy option in the text box below.  This will add a blank policy, and you can fill in the details of the policy using the code snippets on the right.  For a simple throttling policy select the limit call rate option, which adds code to set a limit on the number of calls within a given time window.  By default the Starter product is limited to 5 calls in any 60 seconds, and 100 calls in a week.
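For reference, the policy editor works in XML.  A minimal sketch of the kind of snippet the limit call rate option generates for the Starter limits described above (treat the exact element names and attribute values as illustrative rather than definitive):

 <policies>
      <inbound>
           <base />
           <!-- allow at most 5 calls in any 60 second window -->
           <rate-limit calls="5" renewal-period="60" />
           <!-- allow at most 100 calls per week (604800 seconds) -->
           <quota calls="100" renewal-period="604800" />
      </inbound>
      <outbound>
           <base />
      </outbound>
 </policies>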
This gives you a flavour of what can be controlled with this.

Using products and policies in conjunction allows you to control access to the API and its operations at a fine-grained level, in a way that best fits you and your users.

The next thing I would look at is securing your API so that the rules set up in APIM must be followed.  If people can go to the API directly, bypassing APIM, then these policies and rules are meaningless.

The simplest way to do this is to use mutual certificates between APIM and your API, and add code to your API to ensure that all requests have come from a source holding that certificate.  This can be done by going to the security tab in the API section of the publisher portal,
then picking the mutual certificates option in the drop down.  You will need to upload the certificate to the APIM instance, which can be done by clicking the manage certificates button.  Ensuring the request has come from a trusted source is a coding question, but for completeness, within a C# ASP.NET Web API system you can add a message handler to the pipeline for incoming requests by editing the WebApiConfig class:

 public static class WebApiConfig  
 {  
      public static void Register(HttpConfiguration config)  
      {  
           // require client certificate on all calls  
           config.MessageHandlers.Add(new ClientCertificateMessageHandler());  
      }  
 }  
And then add a class to check the certificate:

 public class ClientCertificateMessageHandler : DelegatingHandler  
 {  
      protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)  
      {  
           bool isValid = false;  
           X509Certificate2 cert = request.GetClientCertificate();  
           if (cert != null)  
           {  
                if (cert.Thumbprint.Equals(RoleEnvironment.GetConfigurationSettingValue("ClientCertThumbprint1"), StringComparison.InvariantCultureIgnoreCase)  
                 || cert.Thumbprint.Equals(RoleEnvironment.GetConfigurationSettingValue("ClientCertThumbprint2"), StringComparison.InvariantCultureIgnoreCase))  
                {  
                     isValid = true;  
                }  
           }  
           if (!isValid)  
           {  
                throw new HttpResponseException(request.CreateResponse(HttpStatusCode.Forbidden));  
           }  
           return base.SendAsync(request, cancellationToken);  
      }  
 }  
This allows you to define two certificate thumbprints in the web.config of the API to compare against the incoming request.  All requests via APIM will include the certificate that you upload in the publisher portal, so if you ensure that this cert is not made public, then you can be assured that all requests hitting the API have come through APIM.

I mentioned making users sign up for access.  Well, if they want to use your API we have just ensured that their requests must be directed via APIM.  The rest is built in.  The products that you configured earlier have a subscription key associated with them, which the consumer of the API must supply with every request.  This ensures that every consumer of the API must have a subscription key.  The developer portal provides a way to subscribe to your APIM instance and thus get a key.  You could restrict this so that you have to manually approve every subscriber before they get a key, and that way you could monetize the access, but that is way beyond the scope of this article.  Suffice to say they need to sign up to get the key or APIM will reject their requests.
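To give a flavour of what this means for a consumer, here is a rough sketch of a call through APIM; the subscription key is normally supplied in the Ocp-Apim-Subscription-Key header, and the gateway host, API suffix and key below are placeholders for your own values:

 using System;
 using System.Net.Http;

 class ApimConsumerExample
 {
      static void Main()
      {
           using (var client = new HttpClient())
           {
                // every request through APIM must carry the product subscription key
                client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR-SUBSCRIPTION-KEY");

                // placeholder endpoint: gateway URL + API suffix + operation
                var response = client.GetAsync("https://yourapim.azure-api.net/yourapi/users").Result;
                Console.WriteLine(response.StatusCode);
           }
      }
 }
Requests that arrive without a valid key are rejected by APIM (with a 401) before they ever reach your API.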

The documentation aspect of APIM is probably something best explored yourself (I'm a techie and not best placed to explain this), but in summary you can document each operation on the page where the operation is created/configured, and the content of the description can take the form of HTML/JavaScript, so you may wish to use some client-side script to retrieve the documentation and manage the content externally.

 <div id="dynamicContent"></div>
<script src="https://myserver/scripts/documentation-ang.js"> </script>  
would call the JavaScript file, which could provide the content required based on data scraped from the rendered page.

The final thing to look at is the analysis of traffic.  Good news: you don't need to do a thing to configure this; basic analytics are provided out of the box.  If you require anything more complex than APIM offers for free you may wish to look at other products that could be bolted onto the system, but in general the data that is captured will tell you a lot.

So, in summary, APIM offers an easy-to-set-up and easy-to-configure interface to your API when you want to publish it to the world.  It gives easy access to some very good security features and access control.  There are many platforms out there that will host an API for you, but if you want to host your own somewhere, APIM offers a lot of what you need to consider before exposing your delicate little flower to the horror of the web.

Any questions or comments are welcome, and I will try to answer anything you ask.  And please share this with everyone you know, and even those you don't know.  Spread the word, the cloud is fluffy and nice!

Monday, 6 July 2015

Dedication, Motivation and Achieving

It has been said many times that to become an expert in anything you need to practice for 10,000 hours.  This theory has been professed for sporting prowess (see the book Bounce by Matthew Syed) and for technical skills (see John Sonmez's podcast) and many people argue against it.

I see the logic behind the idea: if you do something consistently for that long you will surely iron out a lot of the problems you have with it and become better.  You may not become the best; there may be some innate talent involved that you do not possess.  For sporting prowess, anatomically you may not be able to surpass some other people.  But practice anything for 10,000 hours and you will become a whole lot better than you are right now.  One important point is that you probably do need some guidance (training) during these 10,000 hours, otherwise you may develop bad habits that will hold you back and will be difficult to break once they are deeply ingrained.

So, coming to the technical skills of programming, Sonmez argues that practicing, repeatedly trying to build systems and applications whilst trying to extend your skill set, is simply a matter of perseverance and habit.  That you must get yourself into the habit of practicing (coding, for instance) on a regular basis (daily?), and that motivation and drive have little to do with it.  I disagree with this.

In order to be able to commit and persevere with the practice you need some form of motivation, something to keep you going.  The simplest thing that motivates most people is greed.  Sounds dirty, but it's true.  We all live at a level beyond necessity (materially speaking) and we need money to do this.  We may not all strive for more, more, more, but living above the level of having food and shelter could be perceived as greed (I'm not saying we are all greedy, but we all like comfort and the nice things that are accepted as normal in modern living, and these all cost money).  So we all look to have an income, and as programmers/developers/coders writing code is a way of getting this.  So that is a base motivation achieved.  But how does that equate to practicing skills and getting better?  Most of us work for someone else; we have fairly clear objectives of what our employer needs us to deliver in order to maintain that employment and be paid.  This is often not enough to drive us to get better.  We can drift along, doing enough to get by without getting better.  10,000 hours of this will not get you very far.

So how do you get better?  What form will the 10,000 hours of practice take?  Well you could potentially use this employment time, if you use it wisely.  This is easy when you are starting out, all of your time will be spent on new things and so will be useful, but give it a year or so and your day-to-day work will stop stretching you.

10,000 hours equates to 1,250 eight-hour days, or nearly 5 years of normal employment (8 hour days, 260 days per year).  So it is not a quick thing.  And if only the first year or so really stretches you and counts as useful practice, then those 5 years will not actually amount to 10,000 hours.  So how do you build it up?

Side projects, personal learning, pushing your employer to embrace new ideas and technologies.  That is how.  If the project you are working on is stagnating, look for how it could be moved forward to the benefit of your employer and push for them to do it.  It will benefit you from a skills and proficiency perspective, and benefit them where they like it, in their wallet.

But for side projects and personal learning, the problem is often one of motivation, drive and time.  How do you drive yourself to put the time in?  You have just worked a 40 hour week for the man, where is the motivation to put in more time on top of this?  Yes if you put in the hard yards you will see a payback eventually, but 10,000 hours is a long time, and we are simple beasts, we want dividends now.  Something perceivable and measurable.  That is where the motivation aspect comes in.  That is the motivation.

You need some sort of short term perceivable and measurable metric of your progress and success.  Look for what it is in programming that you find most rewarding.  It could be a wider question than that, what do you find rewarding in life?  Personally I like the feeling of seeing a task completed.  The sense of satisfaction that something I set out to do happened and I succeeded in completing it.  I love hiking, and specifically hiking up mountains, and the greatest sense of achievement in this is to get to the summit.  But reaching a high summit can be a hard slog.  It might only take 5 hours rather than 10,000, but physically I still feel that fatigue in reaching my final goal en route.  What I do is to set intermediate goals along the way.  The next 3 miles, the next false summit, where I want to be before 11am.  Anything to give me a sense that I am doing well and on track for my final goal, the little pile of stones that marks the top.

In motivating myself to do some personal development it is the same.  I set short term goals that will get me to the big prize.  Is the big prize the 10,000 hours mark?  No, it's becoming what you want to be personally.  Pinning down what that is is difficult, and right now I can't easily express it, but I can set short term goals, and the final goal will come into sight at some point.

For side projects this short term goal setting is easy (although the side project itself should probably be one of your short term goals, breaking it down further is a good idea): try to work in an Agile manner within the project.  It might be a personal project with a team of 1, but breaking the development down into short iterations, with a goal of delivering a functioning system at the end of each iteration, is still a very valid approach.

If you do this, and your psyche works anything like mine, then at the end of each iteration you will feel a sense of achievement and that will help you to motivate yourself for your next iteration.  You will surprise yourself by how easy it becomes to motivate yourself to complete the tasks of the iteration, because you will want that sense of achievement at the end of the iteration, and quickly the side project will take shape and maybe start to deliver on that base motivation, hard cash!  The great side benefit is that you will push yourself towards that 10,000 hours mark, which is a notional mark to indicate you are becoming a much better developer.

Motivation is important.  It's not just about turning the wheel; you must find your own motivation by setting achievable milestones and rewarding yourself in some way that helps to keep you focused on the next step, then the next step, and eventually the big prize.

Monday, 15 June 2015

Can Adding an API Save Your System?

Creating an API as a publicly accessible interface to your system can actually make your system better than it currently is!

A bold claim, I know, but bear with me and I will try to justify it.  Let's begin by imagining a large system which has evolved over time into something your customers like.  It fits the purpose, it earns you money, it continues to grow and evolve.  All sounds well.  Now look behind the curtain, in the box, under the surface.  Those little duck legs are paddling like mad just to keep the serene appearance on the surface.  Your system has evolved into something of a mess beneath the surface.  You're not at fault; it's a fact of software systems that over time commercial pressures mean that technical debt accrues and the system architecture is corrupted into a mess.  Often UI and business logic boundaries are blurred, code duplication becomes rife and maintainability is generally reduced.



A nice clean architecture would have clear separation between the layers of the system in terms of the responsibilities, but more importantly in terms of the code base.  In reality many systems blur these boundaries over time and code that should be in a business layer may be forced into the UI layer by time and commercial pressures.  For a system with one UI this does not surface as a problem, but consider adding a second UI, quickly the business rules that you think are robust and well tested are being violated.  How come? Well they were implemented in the previously sole UI, but are now being bypassed by the new UI.  This highlights the build up of technical debt that putting the code in the wrong place causes.  But as I say, you could live with this happily for years unless you introduce a new UI.

In a clean system the abstraction between layers should be such that any layer could be replaced and the system still function.  If there is an overlap in responsibilities between layers this is not so straightforward.

Given the evolution of the technological landscape to a much more mobile, flexible one, with the desire to access everything from everywhere, there is an obvious drive towards supporting multiple interfaces to a system to accommodate this.  Take a system that was written for a standard office of the past.  It may have a big thick client with maybe just a shared database behind, or a thickish client with a server side engine that does the grunt work and a central database.  To evolve either of these systems a web front end may be produced utilizing the same data.  Where a server side engine existed this may be reused for consistency of functionality and minimal effort to create the new UI.  If however any of the business logic existed in the client this will need to be replicated in the new web service.  If we extend further and add mobile applications the logic will need to be placed in these too.  And what about integrations with third party systems?  Where does the logic sit there?  We need to contain all the logic in a common business layer, just as our clean system architecture planned.

This is a problem I have seen many times over the years, and often when creating this one new UI the decision was taken to do the 'easy' thing at the time and duplicate the logic.  Often this resulted in the logic being implemented inconsistently in the new UI.  And worse, any bug found in either flavour of the system would only be fixed in that one flavour.

Recently I have been working on the addition of a rich RESTful API for a mature system, and where we find a hole in the business logic due to the API bypassing all UI layers the decision has been taken to do the right thing.  Move the business logic into the business logic layer so all UIs have the same logic implemented by the common layer below.

All this sounds like a lot of bad coding has happened in the past, and that bad decisions have been made as to where to put code logic.  But this is not the case in reality.  Imagine a form in a system that allows you to enter data.  A numeric field is entered, and the form restricts the format of the data, e.g. non-negative, with enforced bounds and precision.  The business layer of the system may well be the place where all the data validation is performed, but what if some of the boundary conditions that should be guarded against are missed when writing this validation code?  No one made the decision to miss this validation.  The testers thought of the boundary cases and tested them.  The system held up robustly because the UI did not allow the user to enter the invalid data.  But if the UI had not enforced these restrictions then the invalid data may have got through.  There was no way to test this.  We thought the system was robust, and it was, but only via the single UI that was available to the testers to explore/exploit the system.

If in the scenario of the ever evolving technological landscape above we add a second UI, for whatever reason these restrictions may not be possible and the system becomes just that bit more fragile.

With an API, we can decide to implement no restrictions (other than type) on the data input and thus force the (common) business layer to take the responsibility for all the data validation.
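As a trivial, invented illustration of what that means in code, the kind of rule the form used to enforce now lives in the business layer, so it applies whether the value arrives from the old UI, a new UI or the API:

 using System;

 // invented example: a numeric rule enforced in the business layer rather than by the form
 public class QuantityValidator
 {
      private const decimal Min = 0m;
      private const decimal Max = 10000m;

      public void Validate(decimal quantity)
      {
           // bounds the UI used to enforce with its input control
           if (quantity < Min || quantity > Max)
           {
                throw new ArgumentOutOfRangeException("quantity",
                     "Quantity must be between " + Min + " and " + Max + ".");
           }

           // precision the UI used to enforce with its input mask
           if (decimal.Round(quantity, 2) != quantity)
           {
                throw new ArgumentException("Quantity cannot have more than 2 decimal places.");
           }
      }
 }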

The decision made to do this on the system I am currently writing the API for means that the API will be a little slower to produce, but more importantly the overall system will end up in a much better state of technical health.  And the biggest benefit from this is that if a new UI is needed in the future that maybe does not even use the API but communicates directly with the business layer, we can be confident that the logic will be intact and robust.

So the addition of an API will not directly save your system, but can give you confidence that the system is fit and healthy enough to evolve further into a rapidly changing world that may put ever more challenging requirements on the core of the system.

Tuesday, 2 June 2015

OAuth in Winforms

The junior dev I am mentoring at my job was recently given the task of extending his knowledge of APIs and asked to demonstrate his newly gained knowledge by producing an application that consumes a third party API. More specifically a RESTful API.

In itself this does not seem the most taxing of tasks, but bear in mind that the junior dev has no web development experience, his only dev exposure in his current role has been on a winforms application. So to make this a targeted task, aimed at learning about RESTful APIs, it was decided a simple winforms application using the 4 main verbs would demonstrate sufficient understanding.

This however did raise a question, how to interact with an OAuth provider from a winforms application.  This should be a simple matter, but it is something that is not well documented, especially in the documentation of most of the more famous APIs.  There are plenty of tutorials for how to authenticate with an OAuth provider from a web site, and most of the APIs the junior dev looked at provided their own OAuth.

The final choice of API to consume was Instagram, which provides great documentation for its OAuth flow when consumed from a web site, but nothing for Winforms.  This is not surprising; Winforms is an old technology, not something you would expect to be used with a service like Instagram, but why not?  It should be possible (and is).  It is understandable, though, that Instagram have not invested time in providing detailed documentation on how to do this.  So here we go on how it was accomplished:

Firstly, the method of validating the user's claim of access to Instagram is via a web page hosted by Instagram.  The documentation states that you should direct the user to the authorization URL (the one used in the code below), which is fairly straightforward in a web app, but how do you do this in a winforms application?

The answer is to host a web browser control within your application which will display the url above and be redirected upon completion of the authorization process.  We found some code with a quick trawl of the search engines to perform this action in a pop up window:

 string authorizationCode = StartTaskAsSTAThread(() => RunWebBrowserFormAndGetCode()).Result;
                 
           private static Task<T> StartTaskAsSTAThread<T>(Func<T> taskFunc)  
           {  
                TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();  
                Thread thread = new Thread(() =>  
                {  
                     try  
                     {  
                          tcs.SetResult(taskFunc());  
                     }  
                     catch (Exception e)  
                     {  
                          tcs.SetException(e);  
                     }  
                });  
                thread.SetApartmentState(ApartmentState.STA);  
                thread.Start();  
                return tcs.Task;  
           }  
           private static string RunWebBrowserFormAndGetCode()  
           {  
                Form webBrowserForm = new Form();  
                WebBrowser webBrowser = new WebBrowser();  
                webBrowser.Dock = DockStyle.Fill;  
                var uri = new Uri(@"https://api.instagram.com/oauth/authorize/?client_id=CLIENT_ID&redirect_uri=REDIRECT_URI&response_type=code");  
                webBrowser.Url = uri;  
                webBrowserForm.Controls.Add(webBrowser);  
                string code = null;  
                WebBrowserDocumentCompletedEventHandler documentCompletedHandler = (s, e1) =>  
                {  
                     string[] parts = webBrowser.Url.Query.Split(new char[] { '?', '&' });  
                     foreach (string part in parts)  
                     {  
                          if (part.StartsWith("code="))  
                          {  
                               code = part.Split('=')[1];  
                               webBrowserForm.Close();  
                          }  
                          else if (part.StartsWith("error="))  
                          {  
                               Debug.WriteLine("error");  
                          }  
                     }  
                };  
                webBrowser.DocumentCompleted += documentCompletedHandler;                 
                Application.Run(webBrowserForm);  
                webBrowser.DocumentCompleted -= documentCompletedHandler;  
                return code;  
           }  
which gets you the code included in the redirect URL.  The CLIENT_ID you need to get from Instagram: register an application with them and it will be provided.  The REDIRECT_URI must match the one registered, but its location is unimportant; as the web browser closes on completion it will never be seen.
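For completeness, the next step of the standard flow (as described in Instagram's server-side documentation) is to exchange that code for an access token by POSTing it, along with your client credentials, to the token endpoint.  A rough sketch, with CLIENT_ID, CLIENT_SECRET and REDIRECT_URI as placeholders (needs using System.Collections.Generic; and using System.Net.Http;):

           private static string ExchangeCodeForAccessToken(string code)
           {
                using (var client = new HttpClient())
                {
                     var values = new Dictionary<string, string>
                     {
                          { "client_id", "CLIENT_ID" },
                          { "client_secret", "CLIENT_SECRET" },
                          { "grant_type", "authorization_code" },
                          { "redirect_uri", "REDIRECT_URI" },
                          { "code", code }
                     };
                     // the response body is JSON containing the access_token and basic user details
                     var response = client.PostAsync("https://api.instagram.com/oauth/access_token",
                          new FormUrlEncodedContent(values)).Result;
                     return response.Content.ReadAsStringAsync().Result;
                }
           }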

There is still one problem: when using Fiddler to inspect the successful API calls from the Apigee test bed, the access token has three period-delimited parts: the user_id, an unknown part, and the code returned from the OAuth authentication stage.  This is not well documented, and at this stage we are unable to generate the full access token.

All this highlights that the documentation of even well established APIs can at times be lacking for non-standard development paths.

Tuesday, 26 May 2015

Consuming Free APIs

My day job has recently thrown up a nice new challenge: producing a public API for the company's core system to enable simpler access to the ecosystem for 3rd party developers.  Historically 3rd party developers have been able to interface with and customize the system by consuming an SDK, but that was in the days of an on-premise installation of the system.  We are pushing towards a multi-tenanted, single instance, cloud based offering, and this cannot be extended or customized by the use of an SDK.  How could the multiple customers using the system each have their own extensions and customizations, purchased from 3rd parties, when there is only a single instance of the service centrally hosted in the cloud?

As a result the decision was made to replace the SDK with a RESTful API which allows access to the data of the individual customer and thus multiple extensions could all coexist, simply accessing the system via the new API.

This new track of development got me thinking.  I had produced APIs in the past: non-web based APIs (using such legacy technologies as COM+ among others), and web based APIs, one using a SignalR interface performing a large amount of business logic and data persistence.  However, none of these APIs had been produced entirely as a public API for consumption by 3rd parties.  All had a target consuming application, be it a cooperating 3rd party (with the potential to resell to other customers) or an internal partner dev team in the case of the SignalR-interfaced system.  These systems were developed as APIs as a way of separating concerns, allowing the multiple teams to develop in isolation, concentrating on their own area of expertise, with an agreed interface.  The key advantage of an API.

So, having not really thought in depth about the creation of an API for general public use, equally I had not thought about this from the other side: consuming a publicly accessible API produced by someone else with whom I have no professional relationship.



I have for some time been toying with the idea of producing a mobile app, but have not yet come up with an idea of something I want to create for the public to see.  I am of the opinion that creating something rather than nothing is a better idea; be it imperfect or not, I should simply create something and release it, and that is what I will do.  This is a way of thinking that is a little new to me.  Previously I was of the opinion that if it was not a 'great' idea there was no point in trying it, but John Sonmez's blog post on the subject convinced me that simply creating something regularly is better than sitting on an idea until a perfect one comes along.

So where do I start?  Years ago I thought that indie games would be the best approach.  I still think this would be a good thing to do, potentially very profitable, but it is an approach that could take a lot of effort to get something out.  And more importantly it is something I have little experience in, particularly the design and aesthetics aspect.  If I want to create something fairly rapidly in my spare time (which is scarce) I need to find a short-cut.  This is where the use of APIs comes in.

There are thousands of publicly accessible, freely accessible APIs out there.  Web sites like https://www.mashape.com, http://www.programmableweb.com/ and http://freeapi.net/ offer a range of APIs, and are just the tip of the iceberg for what is available.  Creating a mobile application that consumes one of these APIs should be fairly straightforward, and as I have no great ideas of my own, my plan is to look through the lists of readily available APIs and try to come up with an idea of how to present their functionality in a new way within a mobile app.

This approach has the massive advantage that someone else has done all the hard work of creating the API, and has hopefully thought through all the logic and created something that will do what it claims to do.  So if I can present a good human interface to such an API, then I will have a useful app, and something that the public may want to consume.

Monday, 18 May 2015

The Wrong Path?

I read an email post today regarding the benefits of doing something rather than nothing, even if it might be the wrong thing to do.  This post by John Sonmez uses the quote:
Sometimes you can only discover the right path by taking the wrong path 50 times.
This struck a real chord with me on a number of levels.  I have a current desire to enter the world of mobile apps, the problem being that I need an idea for an app to create.  I have written at some length here about the use of genetic algorithms to solve difficult mathematical problems, an approach based entirely on exploring the 'wrong path'.  I started a SaaS project with some friends a while back that has stalled somewhat for a number of reasons, not least a reluctance to expose the beta version to wider public scrutiny.  And lastly, in my day job I am working on a public API to the company's core system.  This is a new venture, and we are finding ourselves galloping down the wrong path on a daily basis, and returning to try the left fork rather than the right very quickly.  It has been a very refreshing approach from a software house that is steeped in protracted design and analysis phases before development.  An approach that is producing some considerable discomfort in a majority of the dev team (testers included), but one that I am relishing.



A big part of the problem is that we are attempting to create an automated testing suite as we develop, a great idea in itself; however the testers who are defining the content of this test suite are struggling a little with the idea of the expectations of the system changing faster than the tests are being produced.

For example, when requesting the list of entities related to another entity using an HTTP GET request with the URL /entityA/{id}/entityB, to get all the instances of entityB which have an entityA with id = {id}: if there is no entityA with id = {id} the testers argue that the system should return an error, most likely a 404 Not Found.  This assumption was backed up with the rationale that if the request /entityA/{id} was used then a 404 would be returned.  In contrast, if entityA with id = {id} does exist, but has no entityB children, then an empty list would be valid.

A compromise was made that, for the example we looked at, it was not valid to have an entityA with no entityB children, so if an empty list was retrieved from the database then a 404 would be returned.  The developers were happy, as we could simply return a 404 for 'routed' get-all requests resulting in a zero-length list, a successful empty list for 'direct' get-all requests (/entityB/), and a 404 for a direct get-single request (/entityA/{id}) with no match.

This meant that no validation of {id} was required, so a single database query will give all the information we need.
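To make that concrete, here is a rough Web API sketch of the 'routed' get-all behaviour described above (the entity and repository types are invented for the example; the real system is considerably more involved):

 using System.Collections.Generic;
 using System.Linq;
 using System.Web.Http;

 public class EntityB
 {
      public long Id { get; set; }
 }

 public interface IEntityRepository
 {
      // returns the entityB children of the given entityA; empty if entityA
      // does not exist or simply has no children - a single database query either way
      IEnumerable<EntityB> GetEntityBForEntityA(long entityAId);
 }

 public class EntityAController : ApiController
 {
      private readonly IEntityRepository _repository;

      public EntityAController(IEntityRepository repository)
      {
           _repository = repository;
      }

      // GET /entityA/{id}/entityB
      [HttpGet]
      [Route("entityA/{id}/entityB")]
      public IHttpActionResult GetEntityBForEntityA(long id)
      {
           var children = _repository.GetEntityBForEntityA(id).ToList();

           // for the A,B case an empty list (entityA missing, or no children) is a 404
           if (children.Count == 0)
           {
                return NotFound();
           }
           return Ok(children);
      }
 }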

A couple of days later this became more complex.  An example was found with entityC and entityD, where entityC could exist with no children of type entityD, so the request /entityC/{id}/entityD could produce no results for two situations, entityC with id = {id} may not exist, or it may exist with no children.  The code as it stood returned a successful result with a blank list in both circumstances.  The testers did not like this as we had made the decision to go this way with analysis of only the A,B case, without consideration of the C,D case.

The final decision of what we will do is yet to be made, however we have a system that supports the query of both A,B and C,D working right now.  If we had started analyzing all the possible combinations of entity types in the system (100s) before we decided on a way to go for any routed queries we would have nothing for a good month.

We may have taken the wrong path (the testers would say that we have), or we may have chosen the right one; the architects are arguing that side based on reducing database hits to a minimum (checking whether entityA with id = {id} exists would require another DB call after we know that the full query returns nothing, and for more complex queries with multiple routings this may amount to a significant increase in DB traffic and therefore cost, as it's a cloud based DB).

Whether this will get us to Eldorado in the end only time will tell.  We are due to engage early with a consumer of the API, and their preference on what we return will be the deciding factor, but at least we are in a position very early to show them something.

Wednesday, 22 April 2015

Never Lose an Idea

It may be a bit off topic for this blog, but it's something that has plagued me, so I thought I would document some things I have done to help myself.  I, like many a dev, am forever looking for the next great idea that I can develop to potentially create a secondary source of income, the dream being that this secondary source becomes great enough to replace the primary source of working for someone else.

The problem is getting the idea.  Well, we all have ideas; the efficient programmer blog talks about the fact that we can all come up with ideas, but generally it does not happen when we sit down and try to have an idea.  This was the case for me last week: I had an idea for a SaaS system whilst in the car on the commute home.  Fortunately I was able to explore the idea, and it turned out not to be a particularly viable one, as someone already has a system offering the same service very well at a cheap price.  But what if I had not been in a position to explore the idea straight away?  What if I had been too busy to use the idea right away?  What if I had been on a roll and came up with too many ideas to deal with at once?  How could I keep track of them all?  How could I organise them for a later date?  Well, the efficient programmer blog offered a good solution to this, and I will document my progress through implementing it for myself.

The system they proposed is the use of Trello as a repository for the ideas, and the use of an orchestration system to forward emails into the Trello board.  I am looking to implement a similar system, using only Trello with no orchestration software to set up.

I created 3 boards in Trello: one to contain ideas for content (blogs, videos etc.), one for mobile app ideas, and one for SaaS system ideas.  On each of these boards I added 5 lists: Ideas; To Develop; In Progress; Done; and Rejected.  The plan is that new ideas go into the first.  Once I have done a little background investigation and decided it's a feasible idea, it moves to the second.  When work begins for real, into the third, and when something goes live the card goes to Done.  Anything deemed not feasible goes into the Rejected list.



This gives me the ability to organise and file the ideas, categorising between content/app/SaaS and progressing the ideas.  It also gives me the ability to track rejected ideas so I don't waste time coming up with the same idea every 2 months.  The advantage of the efficient programmer approach was the ability to email in ideas, so as long as you have access to a device with email capability you can record ideas.  I have also added this ability, but my approach does not require any other software; it uses built-in Trello functionality.

To hook up the ability to email a card into Trello, go to the Menu and select 'Email to Board Settings'.  This will bring up a panel with an email address, and the options of which list to add the cards generated to.  You can also choose to add cards to the bottom or the top of the list.  I selected 'Ideas' list and 'Bottom', so the oldest ideas will always be at the top for sanitising.


The approach used by efficient programmer, on the face of it, allows you to have your different categories (content/app/SaaS) on a single Trello board as multiple lists, whereas my approach requires one board per category, with the multiple lists used as a progress status indicator.

Using the email address for a test produces a card:
and the card contains the subject of the email as its title and the body of the email as the card description:

The one problem I have found is the inability to configure the email address associated with the board, but other than that, this approach gives a solution to the problem of recording and organising ideas that are generated at inopportune moments, with little set up.  I set up this entire system in under 20 minutes, including taking screenshots along the way.

Finally, if you found this blog useful and interesting please follow me and share this with anyone and everyone.

Monday, 20 April 2015

How Automated Integration Testing Can Breakdown


Automated integration testing to me means being able to run tests against a system that test the way some or all of the parts integrate to make the whole, fulfilling the requirements of the system, in a way that can be repeated without manual intervention.  This may be as part of a continuous integration type build process, or simply a human initiating the test run.
In order to perform this task, the tests must be able to be run in any order, as any subset of the entire suite, and in parallel with no implications on the result of any test.

For the purposes of automated integration testing, I am an advocate of a BDD-style approach using a test definition syntax such as that of Cucumber, and I will explore a case where I have worked with this approach to integration testing only to be bitten by some inadequacies in the way it was implemented.  I will begin by saying that the choice of using Cucumber-style tests for this project was not necessarily wrong, but greater pre-planning of the approach was needed due to some specific technical issues.

System under Test

To give some background on the system that we were building: it consisted of a small tool that ran a series of stored procedures on a database, which output data in XML format.  This data was then transformed using an XSLT, and the result was uploaded to an API endpoint.  The scope of the testing was the stored procedures and XSLTs, i.e. ensuring that the data extracted and transformed conformed to the schema required by the API and that it comprised the data expected from the database being used.
The database itself was the back end of a large, mature system, the structure and population of which was the responsibility of another team.  Additionally, the system that populates this database has a very thick and rich business layer, containing some very complex rules around the coupling within the data model which are not represented by the simple data integrity rules of the data model itself.  The data model is held only as a series of SQL scripts (the original creation script plus a plethora of update scripts) and as such proves difficult to integrate into a new system.
During the day to day use of the existing, mature system the data is changed in various ways, and it is the job of the new system to periodically grab a subset of this data and upload it to the API using an incremental change model.  So the tests are required to validate that the stored procedures can run for a first time to get all data, and after some modification to get the incremental changes.
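As a rough sketch of the shape of that tool (connection details, procedure names and file paths are placeholders; the real stored procedures and XSLTs belong to the system under test):

 using System.Data;
 using System.Data.SqlClient;
 using System.IO;
 using System.Net.Http;
 using System.Text;
 using System.Xml;
 using System.Xml.Xsl;

 public class IncrementalUploadTool
 {
      // run one stored procedure, transform its XML output and post the result to the API
      public void ExtractTransformUpload(string connectionString, string storedProc, string xsltPath, string apiUrl)
      {
           string rawXml;
           using (var connection = new SqlConnection(connectionString))
           using (var command = new SqlCommand(storedProc, connection) { CommandType = CommandType.StoredProcedure })
           {
                connection.Open();
                // the stored procedures return their results as XML
                using (var reader = command.ExecuteXmlReader())
                {
                     var doc = new XmlDocument();
                     doc.Load(reader);
                     rawXml = doc.OuterXml;
                }
           }

           // transform the extracted data into the schema required by the API
           var transform = new XslCompiledTransform();
           transform.Load(xsltPath);
           var output = new StringBuilder();
           using (var input = XmlReader.Create(new StringReader(rawXml)))
           using (var writer = XmlWriter.Create(output))
           {
                transform.Transform(input, writer);
           }

           // upload the transformed payload to the API endpoint
           using (var client = new HttpClient())
           {
                var content = new StringContent(output.ToString(), Encoding.UTF8, "application/xml");
                client.PostAsync(apiUrl, content).Wait();
           }
      }
 }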

How Integration Tests were Set Up

The tests themselves followed the format:

 Given a blank database  
 And data in table "mytable"  
 |field1|field2|field3|  
 |value1|value2|value3|  
 When the data is obtained for upload  
 Then the output will be valid  
With the variation
 Given a blank database  
 And data in table "mytable"  
 |field1|field2|field3|  
 |value1|value2|value3|  
 And the data is obtained for upload  
 And data in table "mytable" is updated to   
 |field1|field2|field3|  
 |value1a|value2a|value3a|  
 Then the output will be valid  

for an incremental change.
This required that a unique blank database be created for that test (and for every test in practice), the data be added to the database, the process of running the stored procedures and transforms be performed, and the resulting data for upload be used in the validation step.  Creation of a blank database is simple enough, and can be done either with the scripts used by the main system or, as we chose, by creating an Entity Framework code-first model to represent the data model.  The first and most obvious problem, as you will have guessed, is that when the core data model changes the EF model will need to be updated.  This problem was swept under the carpet as the convention is to never delete or modify anything in the data model, only to extend it, but it is still a flaw in the approach taken for the testing.
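For context, the scenario steps were bound with SpecFlow-style step definitions along these lines (a simplified sketch; the S200DBBuilder methods called here are illustrative stand-ins for the real helpers, parts of which are shown later in the post):

 using TechTalk.SpecFlow;

 [Binding]
 public class UploadScenarioSteps
 {
      private S200DBBuilder _dbBuilder;
      private string _uploadPayload;

      [Given(@"a blank database")]
      public void GivenABlankDatabase()
      {
           // illustrative: create a uniquely named empty database from the EF code-first model
           _dbBuilder = new S200DBBuilder();
           _dbBuilder.CreateBlankDatabase();
      }

      [Given(@"data in table ""(.*)""")]
      public void GivenDataInTable(string tableName, Table table)
      {
           // illustrative: the entity helper for the table turns the Gherkin rows into entities
           // and auto-populates any rows needed to satisfy foreign key constraints
           _dbBuilder.AddDataToTable(tableName, table);
      }

      [Given(@"the data is obtained for upload")]
      [When(@"the data is obtained for upload")]
      public void WhenTheDataIsObtainedForUpload()
      {
           // illustrative: run the stored procedures and xslt transform against the test database
           _uploadPayload = _dbBuilder.RunExtractAndTransform();
      }

      [Then(@"the output will be valid")]
      public void ThenTheOutputWillBeValid()
      {
           // illustrative: validate the transformed payload against the schema required by the API
           _dbBuilder.ValidateUploadPayload(_uploadPayload);
      }
 }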

The second, and in my opinion most problematic, area of this approach comes from data integrity.  The EF model contains the data integrity rules of the core data model and enforces them; however if, for example, in the test above 'mytable' contained a foreign key constraint from 'field1' to some other table ('mytable2'), this other table would also need populating with some data to maintain data integrity.  The data in 'mytable2' is not of interest to the test as it is not extracted by the stored procedures, so any data could be inserted so long as the constraints of the data model are met.  To this end a set of code was written to auto-populate any data required for integrity of the model.  This involved writing some backbone code to cover the general situation, and one class for each table
(for example
  class AlternativeItemHelper : EntityHelper<AlternativeItem>  
   {  
     public AlternativeItemHelper(S200DBBuilder dbbuilder)  
       : base(dbbuilder)  
     {  
       PrimaryKeyField = "AlternativeItemID";  
       ForeignKeys.Add(new ForeignKeySupport.ForeignKeyInfo<AlternativeItem, StockItem> { LocalField = "ItemID", ForeignField = "ItemID", Builder = dbbuilder });  
       ForeignKeys.Add(new ForeignKeySupport.ForeignKeyInfo<AlternativeItem, StockItem> { LocalField = "ItemAlternativeID", ForeignField = "ItemID", Builder = dbbuilder });  
     }  
   }  
will create data in the 'StockItem' table to satisfy the constraints on the 'ItemID' and 'ItemAlternativeID' fields of the 'AlternativeItem' table)
that was to be populated as part of a test scenario.  As you can imagine, if 'mytable2' contains a similar foreign key relation, then 3 tables need to be populated, and with the real data model this number grows very large for some tables due to multiple constraints on each table.  In one test scenario the addition of one row of data to one table resulted in over 40 tables being populated.  This problem was not seen early, as the first half dozen tables populated did not have any such constraints, so the out-of-the-box EF model did not need any additional data to be satisfied.
One advantage of the use of an EF model that I should highlight at this stage is the ability to set default values for all fields; this means that non-nullable fields can be given a value without it being defined in the test scenario (or when auto-populating the tables for data integrity reasons alone).
If the data in the linked tables was of interest to the test scenario, then those tables could be populated in the pattern used in the example scenario, and so long as the ordering of the data definition was correct the integrity would be maintained with the defined data.

The third problem was the execution time of the tests.  Even the simplest of tests had a minimum execution time of over one minute, dominated by the database creation step.  In itself this is not a show stopper if tests are to be run during quiet time, e.g. overnight, but if developers and testers want to run tests in real time, and multiple tests are of interest, this means a significant wait for results.

Summary

The biggest problem with the approach taken was the time required to write a seemingly simple test.  The addition of data to a single table may require developer time to add in the data integrity rules, a large amount of tester time to define what data needs to be in each table required by the integrity rules of the model and whether the default values are sufficient, and potentially dev time to define the default values for the additional tables.  In the end a suite of around 200 tests was created which takes over 3 hours to run, but due to a lack of testing resource full coverage was never achieved, and manual testing was decided as the preferred approach by management.

Supporting Code Examples

The example entity helper for auto-populating additional tables is derived from a generic class providing help for all entity types; this takes the form:
  public abstract class EntityHelper<T> : IEntityHelper<T>  
   where T : class, new()  
   {  
     protected S200DBBuilder dbbuilder;  
     protected DbSet<T> entityset;  
     protected long _id = 0;  
     protected string PrimaryKeyField { get; set; }  
     protected Lazy<GetterAndSetter> PkFieldProp;  
     public Lazy<List<PropertySetter>> PropertySetters { get; protected set; }  
     public EntityHelper(S200DBBuilder dbbuilder):this()  
     {  
       Initialize(dbbuilder);  
     }  
     protected EntityHelper()   
     {  
     }  
     public object GetRandomEntity()  
     {  
       return GetRandomEntityInternal();  
     }  
     protected T GetRandomEntityInternal()  
     {  
       T entity = new T();  
       //need to set all the properties to random values - and cache a way to create them faster  
       PropertySetters.Value.ForEach(ps => ps.SetRandomValue(entity));  
       return entity;  
     }  
     public virtual void Initialize(S200DBBuilder dbbuilder)  
     {  
       this.dbbuilder = dbbuilder;  
       this.entityset = dbbuilder.s200.Set<T>();  
       ForeignKeys = new List<IForeignKeyInfo<T>>();  
       PkFieldProp = new Lazy<GetterAndSetter>(() =>  
       {  
         var type = typeof(T);  
         var prop = type.GetProperty(PrimaryKeyField);  
         return new GetterAndSetter { Setter = prop.GetSetMethod(true), Getter = prop.GetGetMethod(true) };  
       });  
       //initialise the PropertySetters  
       PropertySetters = new Lazy<List<PropertySetter>>(() =>  
       {  
         var list = new List<PropertySetter>();  
         list.AddRange(typeof(T)  
                     .GetProperties()  
                     .Where(p => !p.Name.Equals("OpLock", StringComparison.OrdinalIgnoreCase))  
                     .Where(p => !(p.GetGetMethod().IsVirtual))  
                     .Select(p => PropertySetterFactory.Get(dbbuilder.s200, p, typeof(T)))  
                     );  
         return list;  
       });  
     }  
     protected virtual T AddForeignKeys(T ent)  
     {  
       UpdatePKIfDuplicate(ent);  
       ForeignKeys.ForEach(fk => CheckAndAddFK(fk, ent));  
       return ent;  
     }  
     protected void UpdatePKIfDuplicate(T ent)  
     {  
       //assumes all keys are longs  
       var pk = (long)PkFieldProp.Value.Getter.Invoke(ent, new object[] { });  
       var allData = entityset.AsEnumerable().Concat(entityset.Local);  
       var X = allData.Count();  
       while (allData.Where(e => PkFieldProp.Value.Getter.Invoke(e, new object[] { }).Equals(pk)).Count() >0)  
       {  
         pk++;  
         PkFieldProp.Value.Setter.Invoke(ent, new object[] {pk });  
       }  
     }  
     protected T ReplicateForeignKeys(T newent, T oldent)  
     {  
       ForeignKeys.ForEach(fk => fk.CopyFromOldEntToNew(oldent, newent));  
       return newent;  
     }  
     public void AddData(IEnumerable<T> enumerable)  
     {  
       entityset.AddRange(enumerable.Select(ent => AddForeignKeys(ent)));  
     }  
     public void UpdateData(IEnumerable<T> enumerable)  
     {  
       foreach (var newent in enumerable)  
       {  
         var oldent = GetCorrespondingEntityFromStore(newent);  
         UpdateEntityWithNewData(oldent, newent);  
         dbbuilder.s200.Entry(oldent).State = EntityState.Modified;          
       }  
     }  
     protected void UpdateEntityWithNewData(T oldent, T newent)  
     {  
       foreach (var prop in typeof(T).GetProperties())  
       {  
         //todo - change this line to be a generic check on the prop being a primary key field  
         if (prop.Name.Equals("SYSCompanyID")) continue;  
         var newval = prop.GetGetMethod().Invoke(newent, new object[] { });  
         // Not sure if this is the correct place to do this, will check with Mike W  
         if (newval != null)  
         {  
           var shouldUpdateChecker = UpdateCheckers.Get(prop.PropertyType);  
           shouldUpdateChecker.Update(newval, oldent, prop.GetSetMethod());  
         }  
       }  
     }  
      public void Delete(T entity)  
      {  
        var storeentity = GetCorrespondingEntityFromStore(entity);  
        DeleteEntity(storeentity);  
      }  
     private void DeleteEntity(T entity)  
     {  
       entityset.Remove(entity);  
     }  
     public void Delete(long id)  
     {  
       var entity = GetById(id);  
       DeleteEntity(entity);  
     }  
     public void DeleteAll()  
     {  
       var all = entityset.ToList();  
       entityset.RemoveRange(all);  
     }  
     public long AddSingle(T entity)  
     {        
       var id = Interlocked.Increment(ref _id);  
       SetId(entity, id);  
       AddData(new[] { entity });  
       return id;  
     }  
     protected void SetId(T entity, object id) { PkFieldProp.Value.Setter.Invoke(entity, new[] { id }); }  
     protected T GetCorrespondingEntityFromStore(T newent) { return GetById(PkFieldProp.Value.Getter.Invoke(newent, new object[] { })); }  
     protected T GetById(object id) { return entityset.AsEnumerable().Single(ent => PkFieldProp.Value.Getter.Invoke(ent, new object[] { }).Equals(id)); }  
     public void UpdateAllEntities(Action<T> act)  
     {  
       entityset.ToList().ForEach(act);  
     }  
     public void UpdateEntity(int id, Action<T> act)  
     {  
       var entity = GetById(id);  
       act(entity);  
     }  
     public IEnumerable GetDataFromTable(Table table)  
     {  
       return table.CreateSet<T>();  
     }  
     public void AddData(IEnumerable enumerable)  
     {  
       var data = enumerable.Cast<T>();  
       AddData(data);  
     }  
     public void UpdateData(IEnumerable enumerable)  
     {  
       var data = enumerable.Cast<T>();  
       UpdateData(data);  
     }  
     protected List<IForeignKeyInfo<T>> ForeignKeys { get; set; }  
     protected void CheckAndAddFK(IForeignKeyInfo<T> fk, T ent)  
     {  
       //first get the value on the entitity and check if it exists in the model already  
       fk.CreateDefaultFKEntityAndSetRelation(ent);  
     }  
     public void CreateDefaultEntity(out long fkID)  
     {  
       var entity = new T();  
       fkID = AddSingle(entity);  
     }  
     public void CreateDefaultEntityWithID(long fkID)  
     {  
       var entity = new T();  
       SetId(entity, fkID);  
       AddData(new[] { entity });  
       //the _id field needs to be greater than the id used here, so   
       fkID++;  
       if (fkID >= _id)  
         Interlocked.Exchange(ref _id, fkID);  
      }  
    }  

To create default values for any entity we created a partial class to extend the entity class that has one method:
  public partial class BinItem  
   {  
     partial void OnCreating()  
     {  
       DateTimeCreated = DateTime.Now;  
       BinName = string.Empty;  
       SpareText1 = string.Empty;  
       SpareText2 = string.Empty;  
       SpareText3 = string.Empty;  
     }  
   }  
this method being called as part of the constructor of the entity class
  [Table("BinItem")]  
   public partial class BinItem  
   {  
     public BinItem()  
     {  
          ...
          OnCreating();  
     }  
     partial void OnCreating();  
     ...  
   }  
The entity helpers rely upon foreign key information to know about the constraints on the table; this is supported by a class:
  public class ForeignKeyInfo<T, T2> : IForeignKeyInfo<T>  
     where T : class,new()  
     where T2 : class,new()  
   {  
     public ForeignKeyInfo()  
     {  
       BuildIfNotExists = true;  
       LocalFieldSetter = new Lazy<MethodInfo>(() =>  
         {  
           var type = typeof(T);  
           var prop = type.GetProperty(LocalField);  
           if (prop == null)  
             prop = type.GetProperty(LocalField+"s");  
           return prop.GetSetMethod(true);  
         });        
       LocalFieldGetter = new Lazy<MethodInfo>(() =>  
       {  
         var type = typeof(T);  
         var prop = type.GetProperty(LocalField);  
         if (prop == null)    
           prop = type.GetProperty(LocalField + "s");  
         return prop.GetGetMethod(true);  
       });  
       ForeignFieldGetter = new Lazy<MethodInfo>(() =>  
       {  
         var type = typeof(T2);  
         var prop = type.GetProperty(ForeignField);  
         if (prop == null)  
           prop = type.GetProperty(ForeignField + "s");  
         return prop.GetGetMethod(true);  
       });  
       ForeignTableGetter = new Lazy<MethodInfo>(()=>   
         {  
           var type = typeof(S200DataContext);  
           var prop = type.GetProperty(typeof(T2).Name);  
           if (prop == null)  
           {  
             prop = type.GetProperty(typeof(T2).Name+"s");  
             if (prop == null && typeof(T2).Name.EndsWith("y"))  
             {    
                var currentName = typeof(T2).Name;  
               prop = type.GetProperty(currentName.Substring(0,currentName.Length-1) + "ies");  
             }  
             if (prop == null && typeof(T2).Name.EndsWith("eau"))  
             {  
               prop = type.GetProperty(typeof(T2).Name + "x");  
             }  
             if (prop == null && typeof(T2).Name.EndsWith("s"))  
             {  
               prop = type.GetProperty(typeof(T2).Name + "es");  
             }  
           }  
           var getter = prop.GetGetMethod(true);  
           return getter;  
         });  
     }  
     public string LocalField { get; set; }  
     public S200DBBuilder Builder { get; set; }  
     public string ForeignField { get; set; }  
     public bool DoesFKExist(T ent)  
     {  
        //check the foreign table to see if an entry exists which matches the ent  
       var lf = LocalFieldGetter.Value.Invoke(ent, new object[] { });        
       return GetForeignEnts(lf).Count()> 0;  
     }  
     public void CreateDefaultFKEntityAndSetRelation(T ent)  
     {  
       if (DoesFKExist(ent))  
       {  
         return;  
       }  
       var lf = LocalFieldGetter.Value.Invoke(ent, new object[] { });  
       if (lf == null)  
       {  
         if (BuildIfNotExists)  
         {  
           //the test did not define the FK ID to use, so just default it to the next in the sequence  
           long fkID = 0;  
           Builder.WithDefaultEntity(typeof(T2), out fkID);  
           //now set the FK relation  
           LocalFieldSetter.Value.Invoke(ent, new object[] { fkID });  
         }  
       }  
       else  
       {  
         //create the FK entity using the id that has been passed in  
         Builder.WithDefaultEntityWithID(typeof(T2), (long)lf);  
       }  
     }  
      private T2 GetForeignEnt(object fkID)  
     {  
       return GetForeignEnts(fkID).FirstOrDefault();  
     }  
     private IEnumerable<T2> GetForeignEnts(object fkID)  
     {  
       var castData = (DbSet<T2>)(ForeignTableGetter.Value.Invoke(Builder.s200, new object[] { }));  
       var allData = castData.AsEnumerable().Concat(castData.Local);  
       var fes = allData.Where(fe => ForeignFieldGetter.Value.Invoke(fe, new object[] { }).Equals(fkID));  
       return fes;  
     }  
     private Lazy<MethodInfo> LocalFieldSetter;  
     private Lazy<MethodInfo> LocalFieldGetter;  
     private Lazy<MethodInfo> ForeignFieldGetter;  
     private Lazy<MethodInfo> ForeignTableGetter;  
     public T CopyFromOldEntToNew(T oldent, T newent)  
     {  
       if (DoesFKExist(newent))  
       {  
         return newent;  
       }  
       var value = LocalFieldGetter.Value.Invoke(oldent, new object[] { });  
       LocalFieldSetter.Value.Invoke(newent, new object[] { value });  
       return newent;  
     }  
     public bool BuildIfNotExists { get; set; }  
   }  
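
To tie this together, the helper for a given entity populates its ForeignKeys collection with one of these objects per constraint.  The Bin entity, the builder variable and the BinID property names below are hypothetical, purely to illustrate the shape of the registration:

 //in a hypothetical BinItem helper constructor  
  ForeignKeys = new List<IForeignKeyInfo<BinItem>>  
  {  
    new ForeignKeyInfo<BinItem, Bin>  
    {  
      Builder = builder,        //the S200DBBuilder owning the data sets  
      LocalField = "BinID",     //FK property on BinItem (hypothetical name)  
      ForeignField = "BinID",   //key property on Bin (hypothetical name)  
      BuildIfNotExists = true   //create a default Bin if the test did not supply one  
    }  
  };  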

Monday, 13 April 2015

BDD to Break Communication Barriers

In my years of software development the one underlying theme that has caused the most problems when it comes to delivering quality software that fits the needs of the customer is misunderstanding and miscommunication of requirements.  This is not a problem that is due to any individual doing their job badly, not down to poor communication skills, not down to a lack of effort or diligence by anyone.  It is simply due to the number of times the same information is written, spoken, heard, interpreted and misinterpreted.  It's a tale of Chinese whispers, a term that is probably not politically correct in this day and age, but one that fits the bill.  In an idealised scenario:

  1. Customer dreams up something they want a system to do
  2. Business analyst and sales team talk to them to nail down chargeable and isolated items (stories if you like)
  3. Business analysts document the stories for the technical team(s)
  4. Architects and developers read these, interpret and develop a solution
  5. Testers read and interpret the BA's docs and create a test suite
In this scenario the requirements are 'written' 2 times by 2 people and 'read' 3 times by 3 people, reinterpreted at each step, so what the 3 layers of this process (customer; BA and sales; dev and test) understand the system needing to do can be vastly different, especially in the detail.  A situation can occur where the BA slightly misunderstands the customer's needs, then does not communicate their understanding thoroughly.  The dev and test teams pick this up and interpret it in 2 further subtly different ways.  All three layers of the process think they understand everything fully, the system 'works' in so much as it performs a function, and all of the tests pass, so the system is delivered or demoed to the customer, but it does not do what the customer really wanted it to do.  Where did the failure happen?  Everyone in the process has performed their own individual part successfully, and something has been built that everyone will be to some extent satisfied with, but it's not what the customer wants.

What is missing here is a single item that all players can refer to and agree upon as fully defining the requirements.  Conventional wisdom would say that the document produced by the BA is that item, and ideally that would work: one document, all parties agree to its contents, all code is built to its specifications, all tests validate these specifications and all the desires of the customer are met.  In practice the 3 ways this document is interpreted mean the one single document conveys three messages.  So how can we overcome this?

What we need is a document that can be 'read' by a machine and interpreted in only one way, so that the code to make the system work and the tests to validate it share the same interpretation.  So long as this can also be read easily in a non-technical manner by the customer and agreed upon, the loop is closed: the customer agrees to the specification when it is written into the document, the tests prove this happens and works, and the code is written to satisfy the tests.  No ambiguity remains.

This forms the basis for the concept of Behaviour Driven Development (BDD), where the desired behaviour of the system is defined and forms the specifications, forms the tests, and drives the development to meet these.  It is akin to test driven development, but the overall behaviour of the system is the driver, not the technical test specifications, which in general do not cover the overall system, but isolated units of it.  The core advantage of BDD is that the behaviour specification is written in a language that can both be read by non-technical personnel (customers) and interpreted by the computer without translation.

The syntax for defining specifications is a well established one, and one that has many flavors (GWT, AAA etc).  A DSL was developed to encapsulate the given-when-then formulation of requirements and has been given the name Gherkin.  For the purposes of using this within a .NET project, SpecFlow is the tool I currently choose to use, although in the past I have used Cucumber and had to write Ruby code to access the actual code of the .NET system.  The advantage of SpecFlow is that the specifications, the code to access the system and the system under test itself can all exist in one place, written in one development language.

I am not writing a post on how to perform BDD here; I am looking to highlight the advantages of a BDD tool like SpecFlow to the development process, and specifically to the communication of detailed technical ideas between different groups and disciplines within the process without ambiguity of interpretation creeping in.  That said, a simple test specification taken from the Cucumber website provides a good place to start in terms of understanding how this fits into the dev process.

 Feature: CalculatorAddition  
      In order to avoid silly mistakes  
      As a math idiot  
      I want to be told the sum of two numbers  
 Scenario: Add two numbers  
      Given I have entered 50 into the calculator  
      And I have entered 70 into the calculator  
      When I press add  
      Then the result should be 120 on the screen  

This specification is the example that is provided when you add a new spec to a unit test project in Visual Studio using the SpecFlow add-in, but it provides a good point to explore the way Gherkin solves the problem of interpretation of requirements.

This specification is very easy to understand: a non-technical person, e.g. the customer, could read this and sign off that it details what they need the system to be able to do.  The question is how this satisfies the needs of the testers to validate that the system does what is described.  Well, that is the beauty of the Cucumber/SpecFlow system.  This series of definitions constitutes a test as well as a requirement specification.  The SpecFlow framework executes a piece of code for each of these lines, the code in question being hooked in by a regular expression match of the definition itself against an attribute on the code method.  The convention is that the 'Then' definition will validate the outcome of the action against the expectations (do an assert if you prefer).  The code that needs to be written to hook this into the production system is very straightforward:
 [Binding]  
   public class CalculatorAdditionSteps  
   {  
     Calculator calc = new Calculator();  
     [Given(@"I have entered (.*) into the calculator")]  
     public void GivenIHaveEnteredIntoTheCalculator(int value)  
     {  
       calc.InputValue(value);  
     }  
     [When(@"I press add")]  
     public void WhenIPressAdd()  
     {  
       calc.DoAddition();  
     }  
     [Then(@"the result should be (.*) on the screen")]  
     public void ThenTheResultShouldBeOnTheScreen(int expectedResult)  
     {  
       Assert.AreEqual(expectedResult, calc.Result);  
     }  
   }  
and this forms the basis of a simple unit test.  As you can imagine, detailing a fully functional production system will involve significantly more code, but with the advantage that if the specifications drive the process, the tests come for free, and the architecture and code design are driven towards a clear and easily instantiated structure.  Minimal coupling and dependencies make the production and maintenance of the 'hook' code here significantly easier.

When performing this as a BDD process, a simple Calculator class will satisfy the needs of the test as far as being able to build:
 class Calculator  
   {  
     internal void InputValue(int value)  
     {  
       throw new NotImplementedException();  
     }  
     internal void DoAddition()  
     {  
       throw new NotImplementedException();  
     }  
     public object Result { get; set; }  
   }  
And when run, the test will fail.  It is also possible to work with only the specification, before the 'hook' code is written, at which stage running the test will give an inconclusive result, highlighting that the specification has been detailed but that no work has been performed to hook it in to validate the system, potentially meaning the new functionality has not been added to the system.
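
For illustration, an unimplemented step looks something like the stub below (the subtract step is invented here); the call to ScenarioContext.Current.Pending() is what causes SpecFlow to report the scenario as pending/inconclusive rather than failed:

 [When(@"I press subtract")]  
  public void WhenIPressSubtract()  
  {  
    //no production code is exercised yet, so the scenario is reported as inconclusive  
    ScenarioContext.Current.Pending();  
  }  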

There are shortcomings to this approach, but as a way of removing the element of Chinese whispers from the development process it goes a long way to solving the problem.

I will showcase a situation where this approach proved problematic in a future blog, a situation where it did test the system successfully but where the overhead of creating and maintaining the specifications and the hook code outweighed the advantages provided.

Tuesday, 7 April 2015

Visualisation of evolution

Even though we know that the end result is what we are after, and speed is one of the most important factors, when assessing different mutation and breeding operator combinations, and the effect of the applied fitness function, it would be nice to track the evolution of the population, or at least that of the fittest member(s), graphically, to quickly see and convey to interested parties what is happening.
To this end I will explore the possibility of hooking a visualisation interface into the algorithm with a minimum of code churn, and minimal speed impact.
The approach I will take is to use a consumer object to handle the generation complete event.  The responsibility of not blocking the processing thread will fall to this consumer, and all details of the rendering of the interim results will be totally hidden from the genetic algorithm itself.  This approach means that if a web enabled front end, or simply a different desktop UI, were needed you would merely need to construct it and inject it.
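As a rough sketch of that idea (the IGenerationConsumer interface and the WpfChartConsumer class below are my own invention for this post, not part of the GA library), the algorithm would only ever see the interface, while the WPF implementation queues the update onto the UI thread so the caller is never blocked:

 //a hypothetical consumer contract - the GA only ever sees this interface  
  public interface IGenerationConsumer  
  {  
    void OnGenerationComplete(int generation, double fitness);  
  }  
  public class WpfChartConsumer : IGenerationConsumer  
  {  
    private readonly ObservableCollection<KeyValuePair<int, double>> points;  
    public WpfChartConsumer(ObservableCollection<KeyValuePair<int, double>> points)  
    {  
      this.points = points;  
    }  
    public void OnGenerationComplete(int generation, double fitness)  
    {  
      //BeginInvoke queues the work on the UI thread, so the GA thread carries on immediately  
      Application.Current.Dispatcher.BeginInvoke(new Action(() =>  
        points.Add(new KeyValuePair<int, double>(generation, fitness))));  
    }  
  }  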

I have chosen to use a WPF GUI as I have experience of automatically updating graphs in this medium; another technology may be better suited to your skill set.  The WPF Toolkit offers a very good charting control, which can plot data that is updated in real time very easily with data binding.  I will not go into the details of the WPF application itself, or the structure of such an application; the details of displaying the evolution are what we are interested in, so that is what I will focus on.  I will say, though, that my chosen architecture employed an MVVM pattern.

 The code for each chart is very simple, with the UI layer being simply

  xmlns:chartingToolkit="clr-namespace:System.Windows.Controls.DataVisualization.Charting;assembly=System.Windows.Controls.DataVisualization.Toolkit"  
 <chartingToolkit:Chart Title="Fitness" >  
       <chartingToolkit:Chart.Axes>  
         <chartingToolkit:LinearAxis Orientation="Y" ShowGridLines="False"  Minimum="{Binding MinFitness}" Maximum="{Binding MaxFitness}" />  
       </chartingToolkit:Chart.Axes>  
       <chartingToolkit:LineSeries DependentValuePath="Value" IndependentValuePath="Key" ItemsSource="{Binding GenerationFitness}" IsSelectionEnabled="True" />  
     </chartingToolkit:Chart>  

Where the MinFitness and MaxFitness are values calculated as results are generated to give a sensible range for the graph, and the GenerationFitness property is a collection holding the points to plot in the graph. This is bound to a view model that exposes the data without exposing the detail of the GA, and this takes the form:

 class ViewModel: NotifyingObject  
   {  
     private Model theData;  
     private double _minFitness;  
     private double _maxFitness;  
     private double varf = 0.01d;  
     private string _results;  
     private int _delay=0;  
     public ViewModel()  
     {  
       theData = new Model();  
       GenerationFitness = new ObservableCollection<KeyValuePair<int, double>>();  
       GenerationF = new ObservableCollection<KeyValuePair<int, double>>();  
       GenerationR = new ObservableCollection<KeyValuePair<int, double>>();  
       theData.NewGeneration += GotNewGeneration;  
       theData.FinalResults += GotFinalResults;  
       ResetFitness();  
     }  
     public int PopulationSize { get { return theData.PopulationSize; } set { theData.PopulationSize = value; } }  
     public int MaxGenerations { get { return theData.MaxGenerations; } set { theData.MaxGenerations= value; } }  
     public double MinFitness { get { return _minFitness; } set { _minFitness = value; OnPropertyChanged(); } }  
     public double MaxFitness { get { return _maxFitness; } set { _maxFitness = value; OnPropertyChanged(); } }  
     public string Results { get { return _results; }set{_results = value; OnPropertyChanged();} }  
     public int Delay { get { return _delay; } set { _delay = value; OnPropertyChanged(); } }  
     public ObservableCollection<KeyValuePair<int, double>> GenerationFitness { get; set; }  
     public ObservableCollection<KeyValuePair<int, double>> GenerationR { get; set; }  
     public ObservableCollection<KeyValuePair<int, double>> GenerationF { get; set; }  
     public ICommand Stop { get { return new RelayUICommand("Stop", (p) => theData.Stop(), (p) => theData.IsRunning); } }  
     public ICommand Start  
     {  
       get  
       {  
         return new RelayUICommand("Start", (p) =>  
         {  
           ClearAll();  
           theData.Start();  
         }  
         , (p) => !theData.IsRunning);  
       }  
     }  
     public ICommand Clear  
     {  
       get  
       {  
         return new RelayUICommand("Clear", (p) =>  
         {  
           ClearAll();  
         }  
           , (p) => !theData.IsRunning);  
       }  
     }  
     private void ResetFitness()  
     {  
       _minFitness = 0d;  
       _maxFitness = 1d;  
     }  
     private void GotNewGeneration(object sender, GenerationEventArgs e)  
     {  
       Application.Current.Dispatcher.Invoke(() =>  
         {   
           GenerationFitness.Add(new KeyValuePair<int, double>(e.Generation, e.Fitness));  
           if (e.Generation ==1)  
           {  
             MaxFitness = e.Fitness * (1d + varf);   
             MinFitness = e.Fitness * (1d-varf);  
           }  
           MaxFitness = Math.Max(MaxFitness, e.Fitness *(1d + varf));  
           MinFitness = Math.Min(MinFitness, e.Fitness * (1d - varf));  
           GenerationF.Add(new KeyValuePair<int, double>(e.Generation, e.F));  
           GenerationR.Add(new KeyValuePair<int, double>(e.Generation, e.R));  
           Debug.WriteLine(String.Format("Generation: {0}, Fitness: {1},R: {2}, F: {3}", e.Generation, e.Fitness, e.R, e.F));  
         });  
       Thread.Sleep(Delay );  
     }  
     private void GotFinalResults(object sender, FinalResultsEventArgs e)  
     {  
       Results = String.Format("R: {0}{1}F: {2}{1}Fitness: {3}{1}{1}From Values:{1}{4}", e.R, Environment.NewLine, e.F, e.Fitness, String.Join(Environment.NewLine, e.GeneValues));  
     }  
     private void ClearAll()  
     {  
       ResetFitness();  
       GenerationFitness.Clear();  
       GenerationR.Clear();  
       GenerationF.Clear();  
       Results = "";  
     }  
   }  

The model behind this does the work of instantiating the GA and it relays the results of each generation and the completion of the run in a meaningful manner to the view model:

 class Model: NotifyingObject  
   {  
     private double targetR = 0.95d;  
     private double targetF = 0.5d;  
     public double TargetR { get { return targetR; } set { targetR = value; OnPropertyChanged(); } }  
     public double TargetF { get { return targetF; } set { targetF = value; OnPropertyChanged(); } }  
     public EventHandler<DoubleEventArgs> NewFitnessValueArrived;  
     public EventHandler<GenerationEventArgs> NewGeneration;  
     public EventHandler<FinalResultsEventArgs> FinalResults;  
     private IGAProvider gaProvider;  
     private GeneticAlgorithm ga;  
     private int _maxGenerations;  
     private bool _isRunning;  
     public Model()  
     {  
       PopulationSize = 10000;  
       MaxGenerations = 100;  
       ////initialise the GA and hook up events  
       const double crossoverProbability = 0.02;  
       const double mutationProbability = 0.8;  
       gaProvider = GetGaProvider();  
       var crossover = new Crossover(crossoverProbability, true)  
       {  
         CrossoverType = CrossoverType.SinglePoint  
       };  
       //var crossover = new AveragingBreeder() { Enabled = true };  
       //inject the mutation algorithm  
       var mutation = new SwapMutate(mutationProbability);  
       //var mutation = new PairGeneMutatorWithFixedValueSum2(mutationProbability){Enabled = true};  
       //var mutation = new SingleGeneVaryDoubleValueMaintainingSum3(mutationProbability, 1d, 0.2d) { Enabled = true };  
       gaProvider.AddMutator(mutation);  
       gaProvider.AddBreeder(crossover);  
     }  
     public void Start()  
     {  
       const int elitismPercentage = 10;  
       var dh = new DoubleHelpers();  
       ga = gaProvider.GetGA(elitismPercentage, dh, PopulationSize);  
       GAF.GeneticAlgorithm.GenerationCompleteHandler generationComplete = ga_OnGenerationComplete;  
       GAF.GeneticAlgorithm.RunCompleteHandler runComplete = ga_OnRunComplete;  
       gaProvider.HookUpEvents(generationComplete, runComplete);  
       Task.Factory.StartNew(() => ga.Run(gaProvider.Terminate));  
       IsRunning = true;  
     }  
     public void Stop()  
     {  
       ga.Halt();  
       IsRunning = false;  
     }  
     private void ga_OnGenerationComplete(object sender, GaEventArgs e)  
     {  
       (gaProvider as DoubleChromosomes).CurrentGeneration = e.Generation;  
       var fittest = e.Population.GetTop(1)[0];  
       var r = (gaProvider as DoubleChromosomes).GetR(fittest.Genes.Select(x => x.RealValue));  
       var f = (gaProvider as DoubleChromosomes).GetF(fittest.Genes.Select(x => x.RealValue));  
       FireGenerationArrived(e.Generation, fittest.Fitness, r, f);  
     }  
      void ga_OnRunComplete(object sender, GaEventArgs e)  
     {  
       IsRunning = false;  
       var fittest = e.Population.GetTop(1)[0];  
       var r = (gaProvider as DoubleChromosomes).GetR(fittest.Genes.Select(x => x.RealValue));  
       var f = (gaProvider as DoubleChromosomes).GetF(fittest.Genes.Select(x => x.RealValue));        
       FireFinalResults(fittest.Genes.Select(x => x.RealValue), r, f, fittest.Fitness);  
     }  
     public IGAProvider GetGaProvider()  
     {  
       var gaProvider = new DoubleChromosomes(TargetF, TargetR){MaxGenerations = _maxGenerations};  
       return gaProvider;  
     }  
     private void FireGenerationArrived(int generation, double fitness, double r, double f)  
     {  
       var h = NewGeneration;  
       if (h == null)  
         return;  
       h(this, new GenerationEventArgs(generation, fitness, r, f));  
     }  
     private void FireFinalResults(IEnumerable<double> geneValues, double r, double f, double fitness)  
     {  
       var h = FinalResults;  
       if (h == null)  
         return;  
       h(this, new FinalResultsEventArgs(geneValues, r, f, fitness));  
     }  
     public bool IsRunning { get { return _isRunning; } set { _isRunning = value; OnPropertyChanged(); } }  
     public int PopulationSize { get; set; }  
     public int MaxGenerations {   
       get { return _maxGenerations; }   
       set { if (gaProvider != null) { gaProvider.MaxGenerations = value; } _maxGenerations = value; OnPropertyChanged(); }   
     }  
   }  

The NotifyingObject that both view model and model derive from is a useful class for implementing the INotifyPropertyChanged interface:

 public abstract class NotifyingObject : INotifyPropertyChanged  
   {  
     protected void OnPropertyChanged([CallerMemberName] string propertyName = "")  
     {  
       var eh = PropertyChanged;  
       if (eh != null)  
         eh(this, new PropertyChangedEventArgs(propertyName));  
     }  
     public event PropertyChangedEventHandler PropertyChanged;  
   }  

and the RelayUICommand is an extension of the RelayCommand class which simply adds a Text property for the button caption.
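
A minimal sketch of such a class, assuming the usual CommandManager-based WPF pattern (this is illustrative rather than my exact implementation), would look something like this:

 //requires System.Windows.Input  
  public class RelayUICommand : ICommand  
  {  
    private readonly Action<object> _execute;  
    private readonly Predicate<object> _canExecute;  
    public RelayUICommand(string text, Action<object> execute, Predicate<object> canExecute)  
    {  
      Text = text;  
      _execute = execute;  
      _canExecute = canExecute;  
    }  
    //the extra property used for the button caption  
    public string Text { get; private set; }  
    public bool CanExecute(object parameter) { return _canExecute == null || _canExecute(parameter); }  
    public void Execute(object parameter) { _execute(parameter); }  
    public event EventHandler CanExecuteChanged  
    {  
      add { CommandManager.RequerySuggested += value; }  
      remove { CommandManager.RequerySuggested -= value; }  
    }  
  }  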

The app that I created allows the user to set the number of chromosomes in the population and the maximum number of generations to allow.  As the algorithm progresses, the values of R and F are plotted along with the fitness, all for the best solution of the current generation.  As a reminder, the target values are R=0.95 and F=0.5, a fitness of 1 would be the ideal, and the out-of-the-box SwapMutate and single point Crossover operations are being employed.  Equally, the fitness evaluation has not changed.

I have produced videos of the output (slowed down) using populations of 4 chromosomes and of 10000 chromosomes to compare the results.  Due to the very small population size of the first, the algorithm actually ends up with a worse solution than one found earlier in the evolution, but this highlights the plotting of the results.