Wednesday, May 11th, 2011
A Simple OO Refactoring Problem That Can Be Solved in under One Hour
When interviewing developers I like to use have a variant on the Rover problem (posed by Thoughtworks during their interview process) . The main difference is that the problem is specified as an existing class that needs refactoring. (essentially it’s a ‘Replace Conditional with Polymorphism’ refactor) and, as such, can be done in around 30-45 mins.
The problem is named “The Iffy Tractor” and you can download the full source for it here: TheIffyTractor or take a quick look at the main class here. It consists of a single class with a test that moves a tractor around a field. The class contains a few large ‘if’ statements that offer a variety of opportunities for the candidate to show off their OO skills. In addition the problem of rotating through the ordinals N/E/S/W to track direction presents a pleasant problem that has a variety of interesting solutions.
It helps to keep the scope of the problem small. It’s easy to get to grips with and start refactoring meaning that you can get something significant out of the candidate in under an hour. Its downside is the outcome is affected by weather you’ve used the refactor before and you can see that in the candidates approach (the test worked very differently in India to in the UK). I don’t find it matters that much though. You just have to interpret the results differently.
Like all tests it’s hard to say for certain how accurate or useful they really are. However the results from different candidates has been varied so they are at least a useful yardstick that provides some empirical measure of the candidate’s ability.
Wednesday, May 11th, 2011
This test is designed for the Business Analyst interview process. Like all tests it covers a core skill. In this case writing up a requirement/story. However, like the developer test (here), it probes something specific in a bit more detail; whether they think about the existing system, not just the requirement to be covered. The test is short and simple:
We have some developers who are working on an online calculator. Currently the site only supports add, subtract and equals. The users are looking to add multiplication to the application. Write a use case to describe this work which could be presented straight to the developers.
Obviously this provides an opportunity to show they can write. Does the BA know how to structure a requirement/story, can they articulate themselves etc? However the test goes a little deeper than that. The average candidates will jot down a story that describes how the new functionality (the multiplication button) should be developed. Better candidates will take into account existing functions in the system, describing how their new function should relate to the existing ones. For example they’ll specify how the multiplication button behaves when it is pressed after each of the other functions or numbers. Better ones still will refer to BODMAS etc.
Wednesday, March 23rd, 2011
The paper: Enabling Testing, Design and Refactoring Practices in Remote Locations was presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST) in Berlin in March 2011.
All credit to Amey Dhoke, Greg Gigon, Kuldeep Singh and Amit Chhajed for putting this paper together; documenting our ideas on promoting learning practices for Distributed Teams.
Thursday, March 3rd, 2011
PDF version available here: Enabling Testing, Design and Refactoring Practices in Remote Locations
This paper was presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST) in Berlin in March 2011.
All credit to Amey Dhoke, Greg Gigon, Kuldeep Singh and Amit Chhajed for putting this paper together, documenting our ideas on promoting learning practices for Distributed Teams.
Learning is a process of successive steps; we learn, we practice, the process cycles. It requires dedication from both teacher and student and it requires constant reinforcement . It is our contention that the best method for transferring skills like testing, refactoring and software design is through contextual learning: An ongoing program of enablement that sits hand in hand with day-to-day tasks. The code base forms the basis for contextual learning providing an information conduit that is location, language and culturally agnostic.
We discuss some of the problems faced by our team: A greenfield, test-driven project with twenty developers split between London and India. We discuss the methods employed to better enable testing and refactoring practices across this geographical divide. We found that different practices better-suited different phases of the project and different stages of learning within the team. As such these practices are mapped to the Shuhari learning model .
We conclude that there is no substitute for colocation. However we found that the team’s motivation is crucial to the success of learning endeavors. Intensive one-on-one practices worked well at the start of the project, when motivation was high and there was lots of ground to cover. As the project continued, the distribution of skills became more even and more collaborative practices were better suited to promoting learning.
Distributed teams are a common occurrence these days, particularly in the software development community. Along with issues of physical separation there are disparities in culture, language, time zone and skill sets .
Testing and refactoring are widely considered to be necessary skills for effective software development. As with other skills, they require learning and practice. The more practice a programmer has the more skilled they become. However, to maximise the programmer’s learning, instruction needs to be both directed and constantly enforced.
It is our belief that traditional learning methods break down in the context of teams separated by geographical boundaries. Classroom style teaching is rarely successful if delivered in isolation. Typical methods such as the sending of code snippets, documents and diagrams cannot substitute for collocated techniques like whiteboard sessions and pairing.
A. The Shuhari Learning Model
We found it useful to relate the team’s learning back to the Shuhari learning model. This model describes three stages of proficiency that students travel through. Shu describes the early stage of learning where students repeat a practice verbatim and in isolation. In the Ha stage their understanding of the practices they have learnt combine and they start to innovate. In the final stage, Ri, they become uninhibited by the constraints of what they have been taught.
B. The Effective Teaching of Refactoring and Testing Practices Requires Contextual Learning
Books and other typical classroom aids provide an effective theoretical basis for refactoring and testing. However the application of these concepts in the context of real-world systems is a far greater challenge. Bookwork is good for providing a conceptual understanding but this needs reinforcing in a real world context. The Strategy Pattern provides a good example . It is often taught using a sorting analogy where different algorithms represent different strategies that can be applied to the sorting of a list, each having the same functional output. For example bubble sort or quick sort algorithms. However, the real world application of this pattern can be quite different. For example replacing conditionals using a Strategy (or Policy) as in the Replace Conditional with Polymorphism refactor  and Conditional Decomposition . Understanding (or teaching) the application of such patterns is difficult in the absence of the contextual complexities of a of a real-world code.
C. The Importance of an Apprenticeship Model
Cognitive Science states that one of the most effective methods of learning is the apprenticeship model . In our context the student is taken through a solution in a collaborative manner, preferably with reference to a vocational situation. In teams following an Agile methodology this generally takes the form of Pair Programming . The pairing method can be used in an instructional way when experienced programmers are paired with less experienced students to lead them through the challenges of daily programming tasks. This facilitates both the constant feedback necessary to learn as well as the contextual basis of a real code base and real world problems.
D. Communication Issues
A lack of direct communication is one of the fundamental problems faced by distributed teams  making all forms of learning a challenge. Language, cultural and time zone differences all play their part. Language differences can be compounded further by a lack of familiarity with the more esoteric concepts and language present in most technical fields.
These factors can lead to unnecessary frustration on both sides, particularly in one-on-one sessions when communicating uses low bandwidth phone lines.
E. Collaboration aids Motivation and Higher Levels of Learning
We found motivation to be a key element in creating a learning culture but motivation is degraded if the team is not operating as a group of equals. It is easy to slip into a them-and-us mentality when teams separated geographically, and this can be worsened if practices focus on a single direction of learning rather than being collaborative. Collaborative practices lead to the feeling of one team, which encourages higher forms of learning (as in the ha, or innovating, stage of learning). They also can be beneficial, particularly when combined with code review: Whilst someone less experienced with Test Driven Development (TDD) may not be able to aide a more advanced one with her practice, he is still able to provide useful appraisal of her code.
Collaborative practices cannot substitute for focussed, individual teaching, which is always needed for accelerated training.
II. THE CONTEXT OF THE ODC TEAM
The Operational Data Cache (ODC) team, from which the experiences described in this paper originate, is a distributed team of around 20 developers spread 12:8 between India and the UK. The project was originally greenfield, is developed using Test Driven Development and has run for a year and a half. There were various disparities in expertise running in both directions across the geographical divide. As a result a variety of methods were explored to distribute this knowledge and skills from one location to another.
III. TENETS THAT DRIVE OUR LEARNING PRACTICES
The practices detailed in the following section are driven from three key tenets:
1. Learning practices must be collaborative and bi-directional. One way ‘instruction’ or ‘review’ will only prove fruitful for limited periods as it stifles both adoption and ownership of the practices being taught.
2. The code base should be considered the main tool for communicating practices and techniques. It is culturally neutral language and forms the key to contextual understanding. Techniques that are code focussed should be preferred to any form of theoretical discourse.
3. The distribution of skills will become increasingly homogenous as the team learns. The project dynamic also naturally changes on all projects . Learning practices should take these changes into account with the team using different practices at different times in the project lifecycle.
IV. STRUCTURAL AND PROCEDURAL SKILLS
We find it beneficial to segregate skills we needed to transfer into two types: Structural and Procedural. We use these to categorise practices.
1. Structural Skills are those required to create well-structured software, for example the application of patterns, the use of different types of unit tests and use of different types of test fixtures and builders. Methods for dealing with these problems were best taught with the code base as the primary medium for knowledge transfer. This being necessitated by the problems being contextual: solutions had to be taught in the context of real world problems.
2. Procedural Skills. These involve the processes a developer goes through writing software for example the compartmentalisation of a problem into individually committable steps, the selection of seams  for refactoring and the test, pass, refactor cycle . Such processes are extremely difficult to teach without colocation.
The practices here mostly focus on Structural Skills.
We found different practices more appropriate at different stages – focused mentoring followed by more collaborative and structured practices. The transition between phases is achieved by continuous learning.
V. DISTRIBUTED LEARNING TECHNIQUES USED TO AID TDD, REFACTORING AND DESIGN PRACTICES ON THE ODC PROJECT
A. Abridged Pairing
The aim of this practice was to transfer OO, testing and refactoring practices from one location to another in an apprenticeship-like manner.
In this practice, two developers, one from each location pair on design and development practices for 1 hour each day. The process is repeated daily to retain continuity. Developers switch pairs at the end of the story. Tasks are set for completion between sessions and questions that come up in the interval are answered. The sessions were facilitated through a desktop sharing tool  and telephone communication.
Pros: Targeted/Interactive: The targeted, interactive nature of this practice made it one of the most productive. Its one-on-one nature allows focus to be specific to the individuals involved and the context of the problem at hand. Pair rotation helps in development of uniform understanding across the team.
Continuity: The on-going sessions provide the feedback necessary to facilitate effective apprenticeship learning, with the pair following the evolution of a real software problem (typically a story) over a number of days.
Real-World Problems: The practice focuses exclusively on real world problems in the context of the project code base and the story being developed. This allows techniques like refactoring, testing and OO design to be discussed in the context of an evolving story. As such, it echoes many of the learning characteristics associated with traditional pair programming .
Cons: Suitable for Structural not Procedural Skill Transfer: Unlike traditional pairing, the short timescales involved in Abridged Pairing make it hard to teach procedural skills (such as the test-pass-refactor cycle ). The sessions tended to work better when structural issues, such as a mixing of concerns in a class, were addressed.
Read-Only Code Communication: Developers can discuss the problem in the context of the code base but the latency of the screen sharing software made concurrent editing impractical. This hinders the usefulness of the code base as a communication tool as changes can only be made by one member at a time.
Frustration: Both sides of the pairing found the practice difficult at times. The one to one nature of the practice makes it open to communication frustration. As such, whilst very successful for short bursts, it became harder to maintain the enthusiasm necessary for such intense sessions in longer-term.
B. Collaborative Refactoring
This practice is similar to a traditional code review . When a story is completed it is handed to a developer from the remote team for review. Instead of the traditional review process the reviewer actually refactors the code he is reviewing. At the end of the session the original programmer analyses the changes that were made and discusses them with the reviewer. The practice works best when it is targeted at a particular goal, for example describing the replace conditional with polymorphism refactor by applying it to a well known piece of code.
Pros: Real-World Problems: There are no contrived examples in this practice. One developer shows the other what they mean by making the changes and then letting the other view the differences. It provides the developer physical examples of the decisions that another developer make, just as they would in a collocated paring session.
Code as the Primary Communication Channel: Rather than describing what they mean over the phone one developer shows the other by changing the code and have the other review it. The code base becomes the communication medium through which the pattern is communicated rather than relying on the phone. For this reason, like Abridged Pairing, it is better suited to OO Design and refactoring (i.e. structural techniques rather than procedural ones like the test-pass-refactor cycle ).
Cons: Time Consuming: Like Abridged Pairing, collaborative Refactoring is quite time consuming and the context switch required from the teacher impacts their performance and Flow.
Unintentional Offense: Also common to Abridged Pairing, this practice has the potential to cause unintentional offense if refactorings are too broad in scope. This is minimised by keeping the session targeted to a specific goal.
C. Code Review Blitz
This practice involves one large, consolidated pairing session incorporating developers from both locations. The practice has three phases: An initial phase in which the stories under review are discussed. A second phase where the stories are reviewed and notes are taken (similar to a regular code review ). In the final stage reviewers provide feedback. Themes from the reviews are collected and addressed in additional one-on-one of group sessions.
Pros: The Group Provides motivation: We noticed a number of advantages when moving to group based practices. The group dynamic gives more inertia to the practice, encouraging participation.
Groups are more disarming: We found the group dynamic to be more collaborative and disarming than direct one-to-one feedback. This makes it more engaging for both teams increasing learning potential.
Collecting Broader Themes: The group nature of this practice makes it easier to collect broader themes to be focussed on separately either in group design sessions or more focussed practices.
Cons: Lack of Review Freshness: Probably the biggest drawback of this practice is the lack of freshness in the code under review, the oldest of which may be weeks old by the time the next Code Review Blitz comes around. This lack of freshness in the minds of the developers weakens the learning potential of the practice.
More Review than Instructional: The Code Review Blitz does not use the code base as a conduit for instruction. As such the learning potential it provides is limited.
D. Secondary Level Training: Driving Focussed Traning Sessions From Broader Review and Instructional Methods
The review and instructional processes described above provide a base level of training, focussing on skills and apprenticeship rather than the software development theory and broader practices. As such we found it to be beneficial to have a second, more focused level. This second level is driven from the Abridged Pairing, Collaborative Refactoring, the Code Review Blitz sessions or even just the general wonderings of the code base that occur during software development. The team looks for overarching themes or problems that require longer, more focussed training or discussion and addresses them specifically. For example it was observed that testing practices were often leading to too much coupling between test classes and implementations. This leads to collocated training sessions on TDD. Other sessions were conducted over phone / screen sharing / video conferencing (The phone / screen sharing generally being considered the most productive).
E. Developer Rotations
The most brute force approach that we tried: developers swap location for two-week periods.
This provides the opportunity to learn development practices first hand, pair etc. It also helps develop a common set of development practices and improves the feeling of collective ownership . This is the only effective way we found for transferring procedural skills like the test-pass-refactor cycle . The downside of this practice is the travel expense.
F. Utilising Practice Champions
We found that where a practice needed transferring, it was often beneficial to focus on one team member who displayed proficiency and who could then inculcate these practices in an on going basis. This technique is particularly useful when transferring procedural rather than structural skills requiring a more apprentice-like learning method.
G. Building a Raport
It is beneficial to include a preparatory phase at the start of the project as well as preceding learning exercises in which the team get honest and open communication. Video conference sessions without specific agendas work well simply as a tool to help build relationships. Developer rotations work even better. We found that one-on-one practices were noticeably easier for both parties where a bond had been formed between the participants earlier in the project.
H. Remote Pair Programming
We did not try this practice due to insufficient tool support, communication issues and timing differences. We make note of it only because of its reported success in other contexts .
This talk presents a case study from the ODC team that explores the challenges faced in transferring skills across a geographical boundary.
Our premise is that distributed communication is a skill distinct from its co-located counterpart. A familiarity with collocated communication can blind us to its ineffectiveness in a geographically and culturally dispersed context. As such we suggest practices that favour the use of the code base as a conduit rather than traditional, verbal methods.
We found that the team needed a range of practices that could be switched in and out at different times in the project’s evolution. Learning practices cause knowledge differentials within the team to subside and this changes the practices that are needed. Mentoring style practices are intense and effective but difficult to maintain in longer term and as such, they are needed less as the team matures. This provides the most benefit at the start of the project when practices sit in the Shu (repetition) phase of learning in Figure 1 . However the prolonged use of mentoring-styled practices can inhibit growth by discouraging equality across the team. Switching to collaborative practices helped to foster the move to the Ha (innovative) stage of learning. Composite practices like the Code Review Blitz provide the benefits of personal direction as well as group feedback. By applying these practices at different phases in project lifecycle, we have achieved a more motivated and cohesive team. We are in the process of extending this work further to explore additional collaborative practices that focus on greater interaction between team members.
 Lotlarsky, J & Oshri,l, (2005) Social ties, knowledge shareing and successful collaboration in globally distribted, system development projects. European Journal of Informantion Systems, 14, pp. 37-48
 B. Ramesh, L. Cao, K. Mohan, P. Xu, “Can distributed software development be agile?”, Communications of the ACM, October 2006, pp. 41-46
 J. Kierievsky, “Refactoring to patterns”, China Machine Press, 2006
 M. Fowler, K. Beck, “Refactoring: improving the design of existing code”, Addison Wesley Longman, 1999, pp.255 – 260
 M. Fowler, K. Beck, “Refactoring: improving the design of existing code”, Addison Wesley Longman, 1999, pp.238 – 240
 S. E. Berryman, “Designing Effective Learning Environments: Cognitive Apprenticeship Models”, ERIC Document, 1991, pp. 1
 D. Wells, “Pair Programming”, http://www.extremeprogramming.org/rules/pair.html, 1997
 L. Layman, L. Williams, D. Damian, H. Bure, “Essential communication practices for Extreme Programming in a global software development team”, Elsevier, 2006, pp. 1-2
 Williams L. and Kessler R., “Pair Programming Illuminated”, Addison-Wesley, 2002, pp. 113-114
 Johnson, P.M., Reengineering Inspection: The Future of Formal Technical Review, in Communications of the ACM. 1998. pp. 49-52
 Distributed agile development at Microsoft patterns and practices group pp. 10-11
 Williams, L., et al., Strengthening the Case for Pair Programming, in IEEE Software.
Online at http://www.cs.utah.edu/~lwilliam/Papers/ieeeSoftware
 Growing Object-Oriented Software Guided By Tests Steve Freeman, Nat Pryce, Addison Wesley 2009
 Working Effectively with Legacy Code – Robert C Martin, 2004
 McCarthy, Patrick, “The World within Karate & Kinjo Hiroshi” in Journal of Asian Martial Arts, V. 3 No. 2, 1994.
 “Developmental sequence in small groups”. Phycological Bullitin 63 (1965)
Sunday, November 14th, 2010
We’ve been interviewing for developers and BAs lately. I’ve always been a fan of using a test that is as close to their day-to-day job as possible (I blogged about it back in 2005 – here). This is the idea of testing applied knowledge rather than book work / factual recall. For developers we use a small class and test. It’s similar to the Thoughtworks Rover problem but has the advantage that it involves refactoring existing code rather than starting from scratch so you can get through it with them in under an hour. The Business Analyst test is similar. It looks at how they think about laying out a story and more importantly what they think is important.
Finally I’m still a fan of exploring Practices Maps during the interview. They’re a great way to see what people value.
Wednesday, August 25th, 2010
I thought this post from Ade was pretty interesting. It refers back to an exercise Joe Walnes ran at Extreme Tuesday. The concept is simple but pretty cool and a few of us here at RBS had a crack at it.
You can see the results below.
We did it as a bit of fun really but it’s been a useful exercise. I found that it forced me to be a little more tangible about what I actually value at a personal level. Also comparing the maps is interesting. Not that surprisingly there’s a fair bit of similarity between those drawn by these guys (they are all TW or ex) but it’s the differences that are really interesting. I reckon you can feel them in the team dynamic. Useful.
I also tried using the technique in an interview situation. We just got some A3 and a marker and we penned out a map for the prospective employee. It was a pretty cool way of exploring his values and it was interesting to contrast it back to the others. Again the differences were interesting.
Saturday, November 21st, 2009
These were originally given as part of the RBS Enterprise Engineering Program with teams attaching each one and presenting back to the group. They could make good basis for a longer worked question in interview … or maybe you just fancy testing your brain??
These scenarios are open ended and can be answered to different levels in different ways. The key point is to have a think about fundamental issues that affect performance and scalability in each scenario. Then try and fit the technologies to them.
Scenario 1 :
System A calculates market risk real time on trades and market data that are currently stored in a Coherence cache. Currently the client application requests the trade and market data from the cache and computes the risk locally on the trader workstation. They are eight-core machines so this is practical but the intention is to scale this out. The market data cache contains 1000 x 3k objects. The trade cache contains 5,000,000 x 50k objects. Trades and market data change intraday.
HPC have been asked to consider this problem with the view of minimising the latency incurred pricing trades. They are considering solutions that use the compute grid + data grid and solutions that use the data grid on its own. Sketch out a compute grid + data grid solution and a data grid only solution. Reflect on the pros and cons of each. Which would you go for?
System B is a web application. Its homepage returns a set of trades from the database based on the users’ profile. The application then keeps this up to date as trades change in the system. The home page takes up to 60s to load. There are 10 users and 50,000 x 3k trades. Users are very unhappy with how long the home page takes to load. The architect on the team is keen to bolt a caching layer in front of the database so you’ve been called in to advise.
Paying note to the current performance of the system and its architecture, what would you suggest that the team do?
System C is a web based retail banking system such as the sort you might use to manage your finances. It is currently a simple three tiered application with load balanced application servers in the middle tier and 5 machine Oracle RAC cluster at the back end. Another bank has just been acquired and the business wants to roll their users in, doubling the load on the system. Regularly accessed user data is currently in the 100GB range. Suggest a solution that might accommodate these changes. In your answer consider the merits of using the following:
- Scaling out Oracle RAC to 10 servers.
- Using a replicated or partitioned cache to hold state.
- Using the Compute Grid to scale out the application tier.
- Using a messaging system.
System D is a trading application that requires users to be able to conduct “what if” modelling by perturbing either market data and/or trade parameters. This is used in two ways, to determine fair pricing prior to trade execution and to examine the overall risks associated with the trader’s individual position. Currently the trading desk has ten users and is located in a single location. Response time is a critical factor as is the ability to handle instruments of widely differing complexity and duration. Detail the factors that should be considered so as to ensure that the solution meets performance requirements. On the basis of these propose a design for the system ensuring that it can scale with increased number of traders, locations, counterparties and instrument types.
Some sample answers can be found here.