engineering – Required Reading
https://REQUIRED-READING.BLOG
Achieve more with less code

There are no small programs
https://REQUIRED-READING.BLOG/2023/09/12/there-are-no-small-programs/
Tue, 12 Sep 2023 18:20:50 +0000

For many coding best practices you read about, you will see exceptions like “for small programs it’s OK not to follow this”. Using global variables and using untyped/dynamically typed languages are two of the most commonly mentioned exceptions. But allowing exceptions for small programs raises two questions: when is a program “small”, and what happens when it grows? Even more, I think that saying “it’s OK for small programs” distracts from the real question:

“Why would you want an exception?”

An exception makes sense if having it means less work. Let’s apply this test to the two examples above.

Writing functions that receive their data as parameters is the same amount of work as writing functions that access global variables directly. So there really is no justification for an exception here.

package main

import (
	"flag"
	"fmt"
)

var (
	// In Go, a package-level variable is the closest we come to a global variable.
	SomeFlag = flag.String("string_flag", "", "A flag")
)

func notLikeThis() {
	fmt.Printf("I'm using the global flag directly. Here it is: %s", *SomeFlag)
}

func main() {
	flag.Parse()
	notLikeThis()
}

The code above is no different from the one below in terms of the work needed to write it, but the second version is much more flexible.

package main

import (
	"flag"
	"fmt"
)

var (
	// In Go, a package-level variable is the closest we come to a global variable.
	SomeFlag = flag.String("string_flag", "", "A flag")
)

func butLikeThis(msg string) {
	fmt.Printf("I'm receiving my data properly. Here it is: %s", msg)
}

func controller() {
	butLikeThis(*SomeFlag)
}

func main() {
	flag.Parse()
	controller()
}

The second example, using untyped languages, which usually means scripting languages that you can run without compile and build steps, is less clear-cut. The problem is that programs grow. Eventually any small program will become a medium or large program. So what do you do when that happens? Will you rewrite your entire program in a different language? Will you rewrite parts? Will you keep it all?
In practice, rewriting never happens. Especially in environments where it’s not you, the developer, but some manager who makes that decision. And even if you are rewriting, you wasted all the previous work, because you are completely replacing it. And because it still is a simple program, the learning effect from implementing it twice is very small.
There is an important trade-off you have to make at the start of development: you have to balance the time you save by using an untyped language against the risk posed by the potential growth of the program. That risk comes in two flavors. There is a small chance that your program will stay small forever; in that case you win. But there is a big chance that your program will keep growing. At that point you face the choice of rewriting what you already have or keeping the untyped language. If you make the decision to rewrite early enough, you lose only a little. If you decide to stick with the untyped language, you enter purgatory. The best way to avoid this is to start with a typed, compiled language, especially when you will not be in control of the decision whether to rewrite but can influence the initial choice of language. If you check the required reading below, you will find that this opinion is strongly at odds with some of the literature. But I stand by my experience.

These two examples illustrate the questions you should ask yourself before employing any exception to best practices:

  • What will it save me now?
  • What will it cost me later, and how likely is that cost to happen?

The answer to both of these questions will change with experience and practice. Some of the coding practices I discuss in this blog take time to learn and turn into a habit. During that time, they have a non-zero cost to you. So you might feel you save something by not using them. The same goes for coding practices that have a real up-front cost.
Long-term costs require even more experience, gathered over a longer time, to assess with any accuracy. You have to experience the exponential difficulty of keeping an untyped program stable and reliable. It is hard to reason about logically, but if you have experienced the agonizing death of a project because of it, you will understand much better why this is a real risk. The flip side of the coin is that you also have to experience project success where everything seems to be going smoothly, and then take a retrospective look at the techniques you employed in that project compared to one that didn’t go as well. When doing this, don’t forget that it’s not all about programming techniques; the people doing the programming are probably even more important.

If you are unsure which way to lean, remember this:

Best practices are techniques that strike the right balance for the long run.

Required Reading

  1. Martin, Robert C. 2008. Clean Code: A Handbook of Agile Software Craftsmanship. 1st edition. Upper Saddle River, NJ: Pearson.
  2. Thomas, David, and Andrew Hunt. 2019. The Pragmatic Programmer: Your Journey To Mastery, 20th Anniversary Edition. 2nd edition. Boston: Addison-Wesley Professional.
Understand Dependency Injection – Part 3
https://REQUIRED-READING.BLOG/2023/09/08/understand-dependency-injection-part-3/
Fri, 08 Sep 2023 18:35:53 +0000

In parts 1 and 2 of this series on dependency injection I showed how dependency injection is implemented and why it is a powerful technique. The motivating examples and coding solutions centered on data provided as parameters in some way, often parameters that are an indirection, for example a file name. This part of the series discusses another common way of making data available to functions: global variables.

There are countless articles available online that explain why global variables are bad and how to deal with them. I’m not going to repeat the details here. The canonical way to remove a dependency on a global variable is to make the data a parameter of the function and have the controller inject the value.

package di

import (
	"flag"
	"fmt"
)

var (
	// In Go, a package-level variable is the closest we come to a global variable.
	SomeFlag = flag.String("string_flag", "", "A flag")
)

func notLikeThis() {
	fmt.Printf("I'm using the global flag directly. Here it is: %s", *SomeFlag)
}

func butLikeThis(msg string) {
	fmt.Printf("I'm receiving my data properly. Here it is: %s", msg)
}

func controller() {
	notLikeThis()
	butLikeThis(*SomeFlag)
}

The new function butLikeThis() now enjoys the same benefits as the examples in the previous articles: it’s easier to test and configure, and much easier to understand and refactor, because the hard dependency on the global variable is gone.

Two important things to note:

  • Singletons [1], i.e. instances of the Singleton anti-pattern, are global variables and should be treated as such.
  • When writing new code, avoid creating global variables whenever possible. This subject is subtle enough to merit detailed exploration at a later time.

Data sharing – The elephant in the room

Whether we pass parameters to functions, access global variables, or look up singletons, we are always sharing data: parameter values are copied or passed by reference from the caller; globals and singletons are accessed directly. Functions consume and possibly modify this data without knowing which other parts of the system may have access to the same instance of that data.
Sharing mutable objects, in particular in concurrent code, is a source of subtle and often surprising errors [PP]. So it’s worth taking a closer look at how introducing dependency injection in place of the other patterns discussed here affects sharing.
Let’s get the easy case out of the way first. Replacing explicit access to globals inside functions with parameters, where the controller (potentially) passes the value of a global as an argument, doesn’t change the degree of sharing. It’s the same as before. The refactoring suggested above purely improves the function that uses the data.
On closer inspection, the examples in the previous articles also do not change the sharing properties of the underlying data. In those cases, the “address” of the data is passed as an argument, but the data is then read from files. Nothing stops other code from modifying those same files while you are reading them; more likely, such attempts would lead to errors somewhere. So in healthy cases, where dependency injection is used to improve code structure, the nature of data sharing does not change unless you take extra steps.
One thing you will encounter when you implement the suggestions in this article series is that the controller will instantiate wrapper objects around data. These objects are easy to share across different parts of your program. This can be a blessing or a curse. The code structure I advocate here makes it more explicit where sharing may occur. Like many other programming techniques, it requires judgment to decide when to share and when to copy the data once you have it available for injection. This is where you come in.

Required Reading

  1. Gamma, Erich, Richard Helm, Ralph Johnson, John Vlissides, and Grady Booch. 1994. Design Patterns: Elements of Reusable Object-Oriented Software. 1st edition. Reading, Mass: Addison-Wesley Professional.
  2. Thomas, David, and Andrew Hunt. 2019. The Pragmatic Programmer: Your Journey To Mastery, 20th Anniversary Edition. 2nd edition. Boston: Addison-Wesley Professional.
Understand Dependency Injection – Part 2
https://REQUIRED-READING.BLOG/2023/07/06/understand-dependency-injection-part-2/
Thu, 06 Jul 2023 14:45:52 +0000

Part 1 covered the structural elements of dependency injection and the style in which you write code using it. In this part, we’ll talk about additional architectural features that are enabled, or at least made easier, by dependency injection.

We are going to talk about two points:

  • Supporting different configurations, e.g. different database back-ends, in your code
  • Supporting changing code behavior without touching your own code.

The differences between the two are subtle, but because the use cases are conceptually quite different, we’ll discuss them separately.

Injecting configurations

“Configurations” here means actual parts of the system, e.g. different database back-ends, cache implementations, etc. One obvious and common use case for this is in testing, where you often want to replace complicated pieces of code with simpler fakes or stubs. Even if you never intend to support different configurations in your production system, the support for much easier testing means that this is a critical technique you should always apply.

So how does it work?

In part 1 I showed how to inject objects that a piece of code depends on directly. To support different configurations, we have to add an abstraction, an interface, that describes what kind of thing has to be injected.
Let’s say we have a request handler that has to look up data in some storage system. At the start, it may look like this:

type UserQueryHandler struct {
	store *DB
}

func (h *UserQueryHandler) HandleQuery(query string) error {
	if err := sanitizeQuery(query); err != nil {
		return err
	}
	d, err := h.store.Get(query)
	if err != nil {
		return err
	}
	// use d somehow
	_ = d
	return nil
}

type DB struct {
	conn *sql.DB
}

func (d *DB) Get(key string) (Data, error) {
	// Build SQL query from key.
	// Build Data from SQL result.
	return Data{}, nil
}

func (d *DB) Set(key string, value Data) error {
	// Write to DB.
	return nil
}

func controller() {
	// Driver name and connection string are placeholders.
	db, err := sql.Open("postgres", "connection string here")
	if err != nil {
		log.Fatal(err)
	}
	handler := UserQueryHandler{store: &DB{conn: db}}
	_ = handler

	// register with HTTP/RPC stack.
}

For testing, you would now always have to provide a DB struct with a SQL backend, which makes tests heavy-weight. Additionally, when running your system, you will likely discover that backend queries are slow and costly, so you might want to introduce a cache.
Since your code is already set up for dependency injection, the changes to get this done are small:

type Store interface {
	Get(key string) (Data, error)
	Set(key string, value Data) error
}

type UserQueryHandler struct {
	store Store
}

func (h *UserQueryHandler) HandleQuery(query string) error {
	if err := sanitizeQuery(query); err != nil {
		return err
	}
	d, err := h.store.Get(query)
	if err != nil {
		return err
	}
	// use d somehow
	_ = d
	return nil
}

type DB struct {
	conn *sql.DB
}

func (d *DB) Get(key string) (Data, error) {
	// Build SQL query from key.
	// Build Data from SQL result.
	return Data{}, nil
}

func (d *DB) Set(key string, value Data) error {
	// Write to DB.
	return nil
}

type InMemoryCache struct {
	db   *DB
	data map[string]Data
}

func (c *InMemoryCache) Get(key string) (Data, error) {
	if d, ok := c.data[key]; ok {
		return d, nil
	}
	return c.db.Get(key)
}

func (c *InMemoryCache) Set(key string, value Data) error {
	// Update cache if needed and write to DB if needed.
	return nil
}

func controller() {
	// Driver name and connection string are placeholders.
	db, err := sql.Open("postgres", "connection string here")
	if err != nil {
		log.Fatal(err)
	}
	cache := InMemoryCache{db: &DB{conn: db}, data: make(map[string]Data)}
	handler := UserQueryHandler{store: &cache}
	_ = handler

	// register with HTTP/RPC stack.
}

The interface now just declares the operations your code needs, and decouples your code completely from how providers implement them. The providers, in this example both DB and InMemoryCache, implement the interface. So the handler could accept either as the backend store. This is the same principle that frameworks use when they provide hooks that you can implement.

In Part 1 I argued that you don’t need injection frameworks to make use of the power of dependency injection. Consider briefly what the injection framework would have to do to support different configurations for production and testing, and maybe even allow flag-controlled settings like --use_in_memory_cache vs --use_memcached.
Powerful frameworks like Guice make this possible, but the cost in terms of indirection is high. Tracing where objects come from is hidden behind a lot of reflective, and thus invisible, automation.

Don’t believe me? As an exercise, either use a framework like Guice for the use cases I outlined above, or, if you want a real challenge, implement a simple injection framework for your favorite language that can support this.

Injecting different behaviors

Injecting different behaviors is very similar to injecting configurations. In the example above we already did this, because the pass-through logic of the cache is a slightly different behavior than before. In that case, though, the interface was aimed more at injecting different providers than at modifying algorithms.
The Strategy pattern focuses explicitly on injecting algorithms rather than data objects. A very simple case is the sort.Slice() function from Go’s standard library. The sorting is generic, but you have to provide a comparator function for it to work. In the example below, we’ll do this by passing a function literal instead of a named function.

type data struct {
	i int
	s string
}

func controller() {
	ints := []int{3, 6, 1, 65, 1, 21}
	d := []data{
		{1, "g"},
		{76, "a"},
		{3, "b"},
		{19, "t"},
		{123, "e"},
	}

	sort.Slice(ints, func(i, j int) bool {
		return ints[i] > ints[j]
	})
	sort.Slice(d, func(i, j int) bool {
		if d[i].i == d[j].i {
			return d[i].s < d[j].s
		}
		return d[i].i < d[j].i
	})
}

This example is a bit simple, but it shows the principle. In languages where functions are first-class objects, you can inject them directly. Otherwise, you’ll have to take the indirect route: define an interface with a single method, have an object implement that interface, and inject the object instance.

What’s next?

So far we’ve covered the use cases where dependency injection is powerful, but we’ve ignored some questions of context. Object lifetime and sharing are important practical considerations when you separate object creation from use. We will cover these topics in part 3 of the series.

Required Reading

  • Dependency Injection on Wikipedia
  • Gamma, Erich, Richard Helm, Ralph Johnson, John Vlissides, and Grady Booch. 1994. Design Patterns: Elements of Reusable Object-Oriented Software. 1st edition. Reading, Mass: Addison-Wesley Professional.
Understand Dependency Injection – Part 1
https://REQUIRED-READING.BLOG/2023/06/27/understand-dependency-injection-part-1/
Tue, 27 Jun 2023 12:58:32 +0000

Dependency injection (DI) is a key principle of clean code. It is also often misrepresented and misunderstood, and thus often disregarded. So let us begin with the definition of dependency injection:

Dependency injection is the coding technique where objects (via constructors or setters) and methods/functions receive other objects they depend on as arguments. [1]

This is it. I want to emphasize here that the principle of dependency injection is completely independent of frameworks that might help (or hinder) this. This distinction is important, because in most discussions the principle of dependency injection and the need for/use of frameworks are conflated into the same thing. They are not the same, and in this article I will show that they are by no means necessary to unlock the power of dependency injection.

Let’s begin with an example. The problem to solve is straightforward: read data from a file and render it for output, for example simple ASCII text for console output and testing, or HTML for presentation in a browser. With that specification, you will often write or see code like this:

func Render(fileName, filter string, mode int) ([]byte, error) {
	d, err := Process(fileName, filter)
	if err != nil {
		return nil, err
	}
	switch mode {
	case HTML:
		// render HTML output
		return nil, nil
	case CONSOLE:
		// render console output
		return nil, nil
	default:
		return nil, fmt.Errorf("invalid mode: %d", mode)
	}
}

func Process(fileName, filter string) (data, error) {
	d, err := Read(fileName)
	if err != nil {
		return data{}, err
	}

	var sth data
	for _, r := range d {
		if ok, err := regexp.MatchString(filter, r); err != nil || !ok {
			continue // skip rows that don't match the filter
		}
		// do something with the filtered data
	}
	return sth, nil
}

func Read(fileName string) ([]string, error) {
	f, err := os.Open(fileName)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	scn := bufio.NewScanner(f)
	scn.Split(bufio.ScanLines)
	out := []string{}
	for scn.Scan() {
		out = append(out, scn.Text())
	}
	return out, scn.Err()
}

Think about how you would unit test Render.

Probably your thoughts are along the lines of

  • Create a few test files with different inputs and files with matching outputs, and run them through Render. OR
  • Have some constant strings in the test, and write them on the fly into temporary files whose names are passed to Render. OR
  • Use an in-memory file system implementation where paths are automatically recognized and redirected in the os.Open call, or pass an extra argument to Read that tells it whether to read from the real OS file system or a test in-memory file system.

All of this just to get a data object that you can then render. If you also think about how you would add the ability to render data coming from a network stream or database, you will end up with more arguments, more if-else or switch statements, and even more complicated test setups.

This is by no means a contrived example. I have seen plenty of code exactly like this throughout my career.

To rewrite this code to use dependency injection you have to do two things:

  • Create a controller function that knows how to create the data objects.
  • Change the signature of the functions to receive the data objects they need.

func controller(fileName, filter string, mode int) {
	fc, err := Read(fileName)
	if err != nil {
		log.Fatal(err)
	}

	d, err := Process(fc, filter)
	if err != nil {
		log.Fatal(err)
	}

	output, err := Render(d, mode)
	if err != nil {
		log.Fatal(err)
	}
	// continue with rendered output
}

func Render(input data, mode int) ([]byte, error) {
	switch mode {
	case HTML:
		// render HTML output
		return nil, nil
	case CONSOLE:
		// render console output
		return nil, nil
	default:
		return nil, fmt.Errorf("invalid mode: %d", mode)
	}
}

func Process(input []string, filter string) (data, error) {
	var sth data
	for _, r := range input {
		if ok, err := regexp.MatchString(filter, r); err != nil || !ok {
			continue // skip rows that don't match the filter
		}
		// do something with the filtered data
	}
	return sth, nil
}

func Read(fileName string) ([]string, error) {
	f, err := os.Open(fileName)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	scn := bufio.NewScanner(f)
	scn.Split(bufio.ScanLines)
	out := []string{}
	for scn.Scan() {
		out = append(out, scn.Text())
	}
	return out, scn.Err()
}

Now think about how you would unit test Render. The test for the rendering logic is now completely independent of how you obtain the data in the first place. It no longer matters if the data is read from a file or a network stream or database, the test will remain the same. All that extra knowledge is now located and contained in the controller function.

Why is this so important?

It makes code easier to understand, change and test. [3]

And where do injection frameworks come in? Usually, they make the controller function obsolete by using reflection or some type system features to identify argument types and constructors or factories for them. The advantage of the framework approach is that there is a little bit less code to write. But that’s also a disadvantage, because the code in the controller function usually tells you exactly when and how an object is constructed, while frameworks hide this. When debugging, that hidden information is often important, yet very difficult to track down when hidden inside the framework.

In my experience, this disadvantage far outweighs the benefits of frameworks, so I always favor creating objects explicitly in code.

Dependency injection is also a key technique to make programs more flexible, by injecting configuration and pieces of code, for example in the Strategy design pattern [2]. We’ll cover those uses in Part 2 of the series.

Required Reading

  1. Dependency Injection on Wikipedia
  2. Gamma, Erich, Richard Helm, Ralph Johnson, John Vlissides, and Grady Booch. 1994. Design Patterns: Elements of Reusable Object-Oriented Software. 1st edition. Reading, Mass: Addison-Wesley Professional.
  3. Thomas, David, and Andrew Hunt. 2019. The Pragmatic Programmer: Your Journey To Mastery, 20th Anniversary Edition. 2nd edition. Boston: Addison-Wesley Professional.
Grow unit tests organically
https://REQUIRED-READING.BLOG/2023/06/07/grow-unit-tests-organically/
Wed, 07 Jun 2023 14:58:17 +0000

Strict test-first development suggests that when you define a new function, you proceed as follows:

  1. Define the function signature.
  2. Define test cases for the function.
  3. Implement the function until all tests pass.

In practice, this doesn’t work, and trying it wastes time. Instead, I recommend growing tests organically together with your code. To illustrate this, I’ll use some snippets from a fictitious coding session and comment on the state of things after each iteration.

Let’s say we want to build some sort of roguelike game. We could begin like this:

package main

type Game struct{}

func (*Game) Run() error {
    return nil
}

func main() {
    g := Game{}
    g.Run()
}

This code is pretty basic and has only one function that could potentially be tested: Game.Run(). Test-first would require a test for this now, and a specification for the behavior we want this to have. That’s problematic for two reasons:

  1. Run() is the game loop. At this stage, we simply don’t know what it is going to do and how. So specifying what this function should do at the level of detail a unit test needs is probably not possible, and it’s certainly not efficient.
  2. Run() is currently empty, and it will continue to do very little for quite a while. Writing a unit test for this would end either with a change-detector test, or a noop-test. Neither adds value, and the change detector creates a lot of churn in addition.

So instead of writing a unit test, we just move on.

package main

type Tile struct{}
type Map struct {
	Tiles [][]Tile
}

func (m *Map) Draw() {
	for _, row := range m.Tiles {
		for _, t := range row {
			// Draw t
		}
	}
}

func TestMap() *Map {
	// returns some test data, omitted for brevity
	return nil
}

type Game struct {
	Map *Map
}

func (*Game) Run() error {
	return nil
}

func main() {
	g := Game{Map: TestMap()}
	g.Run()
}

This iteration adds a few more functions, some actually containing a little bit of code. But there is still very little complexity, and at the same time, many things are still in flux. Consider, for example, drawing the map. In the actual game, it will draw graphics, but for testing, we probably want text output. At this stage, it’s already clear that the code we have isn’t sufficient and will change in the next iteration. It’s also not entirely clear how we will accommodate different drawing methods. Sure, we could write a test, but it would do very little and would have to change a lot in the next iterations. So the argument from the previous section remains: it’s just not worth it yet.

package main

import "fmt"

type Drawer interface {
	Draw(t Tile)
}
type ConsoleDrawer struct{}

func (*ConsoleDrawer) Draw(t Tile) {
	fmt.Printf("%+v", t)
}

type Tile struct{}
type Map struct {
	Tiles [][]Tile

	drawer Drawer
}

func (m *Map) Draw() {
	for _, row := range m.Tiles {
		for _, t := range row {
			m.drawer.Draw(t)
		}
	}
}

func TestMap() *Map {
	// returns some test data, omitted for brevity
	return &Map{drawer: &ConsoleDrawer{}}
}

type Game struct {
	Map *Map
}

func (*Game) Run() error {
	return nil
}

func main() {
	g := Game{Map: TestMap()}
	g.Run()
}

Now we’ve taken a stab at abstracting the drawing behavior and implemented a very basic Drawer. With this, we can seriously start thinking about writing tests for the drawing behavior. But we immediately see that drawing currently produces only console output. That’s hard to test, so we need something more accessible before we can write a test.
Notice, though, how the discussion starts to shift from growing code to testing it.

package main

import (
	"fmt"
	"strings"
)

type Drawer interface {
	Draw(t Tile) error
}

type ConsoleDrawer struct {
	drawer StringDrawer
}

func (c *ConsoleDrawer) Draw(t Tile) error {
	if err := c.drawer.Draw(t); err != nil {
		return err
	}
	fmt.Print(c.drawer.String())
	return nil
}

type StringDrawer struct {
	buffer *strings.Builder
}

func (s *StringDrawer) Draw(t Tile) error {
	if s.buffer == nil {
		return fmt.Errorf("drawer not initialized")
	}
	s.buffer.WriteString(fmt.Sprintf("%+v", t))
	return nil
}

func (s *StringDrawer) String() string {
	return s.buffer.String()
}

type Tile struct{}
type Map struct {
	Tiles [][]Tile

	drawer Drawer
}

func (m *Map) Draw() {
	for _, row := range m.Tiles {
		for _, t := range row {
			m.drawer.Draw(t)
		}
	}
}

func TestMap() *Map {
	// returns some test data, omitted for brevity
	return &Map{drawer: &ConsoleDrawer{}}
}

type Game struct {
	Map *Map
}

func (*Game) Run() error {
	return nil
}

func main() {
	g := Game{Map: TestMap()}
	g.Run()
}

This iteration shows another reason why early testing is often a waste of time. If we already had tests, even empty ones, we’d need to update them now that the signature of Draw() changed. Given that so far we would have had no benefit from any test, that would be wasted time, even if IDEs did most of the heavy lifting. Whether they can even do the heavy lifting depends a lot on the language and test framework; in my experience, you’d be left with a lot of manual work.
Further, the current iteration comes with its own problems that suggest more work is required. With the new StringDrawer we can at least produce string output and write tests that compare those strings. However, you probably noticed that this code is full of flaws. The string buffer gets initialized only once and keeps accumulating writes forever. Also, the drawer now contains state, and that state cannot be meaningfully managed by the drawer itself. The drawer operates on tiles, so it doesn’t even need state. It could just return the string for the last tile, leaving it to the client to construct larger strings. So this already highlights a few more iterations to tidy up this code before it reaches a state stable enough to write an actual test for drawing a map.

This discussion leaves us with a few obvious questions:

  • Should you have unit tests at all?
    Absolutely. Unit tests are still the foundation of rapid and agile development, and you cannot succeed without them.
  • When should you start writing unit tests?
    My rule of thumb is to create a unit test as soon as you can clearly define what the behavior of a function should be.
  • How many unit tests should you have?
    As long as adding more tests has significant additional value, add more. There is no fixed number or percentage.

Someone might argue that all the steps we went through in the above example could be done in one go by thinking hard about the problem for a while before writing any code. And while that’s a theoretical possibility, I don’t work that way, and I have never met anyone who does. Also, this is a snippet from a coding session that takes maybe a few minutes and sits at the very start of a project, so there is no additional context to the code. In real projects, almost all the code you write has context, which makes it much harder to consider all implications up front. So rather than promoting a theoretical ideal, I’m giving practical advice here.

If there is one thing I want you to take from this article, it is that you should be smart about when to write tests and which tests to write. Not all code needs unit tests. For example, the Game.Run() function will probably be better served with E2E tests.

A few rough and ready guidelines:

  1. Write unit tests when you understand the intended behavior of a function or method, not before.
  2. All public functions or methods of a class/module/package should be covered by reasonable unit tests when 1) is true.
  3. Internal methods may significantly benefit from unit tests too. This will be very contentious with some people. I guess we simply have to disagree here.
  4. Beware of change-detector tests. If every sensible test you can come up with for a function becomes a change detector, you should probably refactor the function.
  5. Remove tests that no longer provide enough value.

Required Reading

Change-Detector Tests Considered Harmful

What makes a successful software prototype? https://REQUIRED-READING.BLOG/2023/06/05/what-makes-a-successful-software-prototype/ https://REQUIRED-READING.BLOG/2023/06/05/what-makes-a-successful-software-prototype/#respond Mon, 05 Jun 2023 15:18:03 +0000 http://required-reading.blog/?p=18 A prototype has one job to do: Prove that it is feasible to implement your idea.
There may be additional benefits, like getting an idea of which API or UI decisions work and which don't. But at the end of the day, the prototype exists to prove feasibility. Most importantly, a prototype is not the tool to convince stakeholders that your idea is a good one. That's a non-technical issue, and therefore outside the domain of prototyping.

What is feasibility?

The American Heritage Dictionary defines feasible as “capable of being accomplished or brought about”. That isn’t quite what I mean, though. For a software project to be feasible, it has to be capable of being accomplished within constraints, namely time, effort, and thus money. That means that your prototype has to demonstrate that your idea can be brought about, and it has to inform you at least very roughly how much effort, and thus cost, is required.

What your prototype needs

For simplicity, I’ll assume that your idea is computationally feasible. That’s by no means a given, but understanding computability requires theoretical analysis of the problem you are trying to solve. A prototype will not answer that question, so here we only concern ourselves with cost estimation and efficiency.
The most important questions your prototype should answer are:

  • How much of the use case can be covered by existing libraries and services, and how much do they cost?
    If there are libraries and services you want to use, use them in your prototype, otherwise you will not know if they are fit for purpose.
  • How much code and infrastructure will you have to build yourself?
    This is what’s left over after using all existing libraries and services. The holy grail here is to estimate this accurately. Nobody really has a good way to do this. My best-practice formula goes like this: 1) ask every engineer familiar with the project for their most conservative estimate, 2) take the longest of those estimates and double it, 3) that is the earliest time you will have a beta version to give customers to try. This formula is right about 50% of the time; the rest of the time it underestimates, like every other method out there.
  • Where are the unknowns that the prototype does not cover?
    These are things that are very hard to predict, but you vaguely know are there. For example, it’s easy enough to train an ML model on a restricted input set for a prototype. But training a model that can meaningfully process inputs at scale is a different problem altogether, and it is very hard to predict how long tuning will take and how much it will cost. These are the largest contributors to project risk, and unfortunately, the prototype does little to reduce it.

This information gathered from your prototype is the foundation of your project plan and budget. You might want to throw away the prototype itself, though. Most software projects attempt to build the “real” implementation on top of a prototype, to save time. That is risky, and I’ll discuss strategies for building re-usable prototypes and knowing when to discard a system in later articles. For now, just consider what is more costly: refactoring the prototype to the “final” architecture with all the details, or building it clean from scratch?

Lastly, do you need a prototype?

Almost certainly, yes. Just as no battle plan survives contact with the enemy, no software project plan survives implementation. Things will go wrong, and you will have to adapt. Often, significant parts of your plan prove infeasible during implementation. A prototype is your chance to test the feasibility of the most critical parts of your plan and adjust before you invest heavily. Chances are that what you learn from your prototype leads you to throw it away and start the “real” implementation from a clean slate, but with what you learned in mind.
Agile processes like Extreme Programming are often presented as embracing change to the extent that throw-away software is no longer a thing, and that the process is so flexible you can adapt to anything along the way. In my experience, no practical process achieves that. Applied judiciously, agile processes can reduce risk and cost, but not eliminate them.

In the end, it’s up to you.

Required Reading

  • Brooks, Frederick P., Jr. 1995. “Plan to Throw One Away”, in The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition.
  • Beck, Kent, and Cynthia Andres. 2004. Extreme Programming Explained: Embrace Change, 2nd Edition.
Code less, achieve more
https://REQUIRED-READING.BLOG/2023/06/01/code-less-achieve-more/
Thu, 01 Jun 2023 16:54:23 +0000

Do the least amount of work possible

That is the key principle that will make you more productive, will help you deliver your project on time and within budget, and achieve more. This may sound counter-intuitive at first. Bear with me.
We coders love coding stuff. Stuff that’s functional and clever and a challenge to build. That’s great, this is what keeps us in the game. However, to deliver a system, we have to focus this drive in the right direction.
Your software is supposed to solve a problem. Be that sending and receiving email, payroll processing, or the newest AAA game. What I mean by “do the least amount of work” is that whatever your problem, build the things that directly contribute to the solution, nothing more.
That’s easier said than done, because in practice it is often unclear how much something contributes to the final result, or if at all. For example, building debugging functionality into a system could be considered critical or extra, depending on your point of view. Factors like the life-cycle stage your software is in — for example, is it a prototype or a launched product — affect this as well. All that means is that there is no hard and fast rule to decide what is worthwhile to build. It’s all in the context.

How to

To trim down the work you have to do, you should ask the following three questions for every feature and improvement you are considering.

  1. Will my software do its job without this feature?
    This is the most obvious question, and the single most powerful tool in your toolkit. But in the excitement of brainstorming ideas, all too often we forget to ask it and act on the answer. Again: if you can launch without a feature, don’t build it.
  2. Does the feature have to support all corner cases, or can I apply the 80/20 rule?
    Once you have determined that you really need a feature, try to keep it as small as possible. Sometimes this is a pure engineering question, sometimes this is a product question. It’s not always OK to exclude some corner cases, but if you can, do it.
  3. Are there libraries or services that provide this feature?
    If a stable library or service exists that already implements your feature, or much of it, not using that library is a major risk. Writing your own version guarantees you will make your own mistakes, many of which the library maintainers have already made and fixed. You are creating extra work and risk, which will slow you down.

Therein lies the true power

Judiciously not writing code means you are not creating bugs, not creating complexity, and not creating maintenance burden. In addition to launching faster because you save time writing code, you also launch faster because you save time by not having to fix bugs you didn’t create.

Required Reading

  • Brooks, Frederick P., Jr. 1995. “Plan to Throw One Away”, in The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition.
  • Martin, Robert C. 2008. Clean Code: A Handbook of Agile Software Craftsmanship.