ContribsGH-S
The first way to implement a REST-API in Scala. Synchronous version.
1. Introduction
This second post of the series started here will explain in detail parts of our first solution to a software development problem consisting of:
The implementation of a REST service based on the GitHub REST API v3 that responds to a GET request at port 8080 and address /org/{org_name}/contributors with a list of the contributors and the total number of their contributions to all repositories of the GitHub organization given by the org_name part of the URL.
As explained in the previous post, our first and subsequent solutions to the problem share a structure consisting mainly of the following modules:
- A REST client, responsible for managing the necessary requests to the GitHub REST service.
- A REST server, in charge of generating the response to the requests made to our service.
- A processing module, which takes the outputs of the REST client, processes them, and builds the structure needed as input by the REST server.
The implementation of our REST client and server components (which, for the most part, will remain almost unaltered in the different solutions proposed to our problem) does not exceed 60 lines of Scala code. The difference between solutions, on the other hand, will reside mainly in the processing module, which in the first solution presented here has only around 10 lines of Scala code.
This post will explain the code of the REST server and the processing module of the first solution, leaving for the third post the explanation of the REST client and the processing module of the second solution.
2. Some details of our first solution explained
Finally, the time has come to start exploring the details of our first solution to the problem.
The functions used by the program will be presented and explained in a top-down way, following the order in which they were designed, opposite to the bottom-up order in which they were implemented. Following this order we will be able to illustrate a design style in which each step of the refinement process of a program is guided by the types of the needed functions, a design strategy frequently, and fruitfully, used when writing programs in a functional programming language.
But first (sorry, another aside), a taste of pattern matching in Scala is in order.
Pattern matching by example
Pattern matching is a powerful feature of Scala inherited from other functional programming languages. It is used to selectively evaluate part of the code of a complex “match expression”, depending on whether or not its argument satisfies the conditions associated with each of its constituent “case clauses”.
Syntactically, a match expression consists of an argument expression (frequently just a variable), the match keyword, and one or more case clauses, each one consisting of:
- the reserved word case followed by a pattern and an optional guard,
- the symbol =>,
- and an expression associated with the pattern.
The patterns are matched in sequence against the value of the argument expression. If a pattern matches (and the guard, if there is one, is satisfied), the value of its associated expression is returned as the value of the entire match expression. In a sense, a match expression serves the same purpose as a “switch” statement or a group of nested if-then-else statements in a traditional non-functional programming language, but it allows for much more concise and readable code.
The patterns can be: constants, variables (with or without a declared type), the underscore (which acts as a “don’t care” variable) and class constructors with variables within them (which, in case of a match, act as extractors of parts of the matched pattern that can be used in the remaining parts of the case).
In the example:
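A minimal sketch of such a function, reconstructed from the description that follows, could look like this. Only the shape of the four cases is taken from the text; the wrapper object PatternMatchingSketch and the returned strings are illustrative assumptions.

```scala
object PatternMatchingSketch {
  // Takes anything and returns a description of what was matched.
  // The returned strings are assumptions made for illustration.
  def matchList(x: Any): String = x match {
    // Lists with more than two elements: the guard rejects an empty tail.
    case _ :: second :: rest if rest.nonEmpty =>
      s"A list with more than two elements; the second one is $second"
    // Lists with exactly two elements; the second element is ignored.
    case first :: _ :: Nil =>
      s"A list with exactly two elements; the first one is $first"
    // Any other list (empty or with one element), whatever its element type.
    case l: List[_] =>
      s"Another list, with ${l.size} element(s)"
    // Anything that is not a list.
    case _ =>
      "Not a list"
  }
}
```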
- We define a Scala function that takes a single argument of type Any and returns the result (of type String) of evaluating a match expression against the argument.
- So, the type of the defined function is (Any) -> String, i.e. a function that takes anything and returns a string.
- The first two patterns use the :: list constructor, explained in our aside about lists.
- The first pattern has a “guard”, a condition that must hold (in addition to the pattern being matched) for the case to be considered successful. The guard causes the case to fail if the matched list has exactly 2 elements.
- The second pattern matches only lists of exactly two elements and extracts the first one to use it in the associated expression (the second element is “extracted but ignored” because it is matched by _). The case succeeds when the previous one fails because of the guard.
- The first pattern extracts from a matched list its second element (in this case the first one is ignored) and the list of the elements after the second (which could be the empty list Nil), to use them in the associated expression and in the guard, respectively.
- The third pattern has a declared type that allows it to match any List, ignoring the type of its elements.
- The last case matches any argument not matched by the preceding cases.
To see matchList in action, define it in a Scala console and call it with arguments like List(1,2,3), List("a","b"), "a String", 123, etc.
To further appreciate the power and convenience of pattern matching in Scala you can try the following experiment: write an equivalent function in Java or Python and compare it with this one in terms of number of LOC and readability.
Returning to the discussion of our solution, the REST server module takes just the following lines:
This code uses the Lift function serve, which takes as argument a partial function associating a request (in the format established for our endpoint) to a LiftResponse, which in turn is returned by another function (aptly named buildRestResponse) whose only argument is the part of the URL representing the name of the organization. This name is a string extracted from the URL by the pattern-matching construct used to define the partial function.
Partial functions in Scala
In Scala a partial function is a function that can be undefined for some values of its (only) argument. Syntactically it takes the form of a sequence of one or more case clauses not preceded by a match (the single argument of the partial function is implicitly matched against the case(s)).
The partial function given as argument to the Lift function serve has a single case that matches a Req (a request to the REST server) composed of the URL (expressed as a list of strings), an ignored second element, and a request type (a GET in our case).
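The idea can be tried out without Lift. In the sketch below, Req, GetRequest and PostRequest are hypothetical, simplified stand-ins for the Lift types (not the real classes), and buildRestResponse just returns a string instead of a LiftResponse; only the path segments come from the endpoint described in the introduction.

```scala
object PartialFunctionSketch {
  // Hypothetical, simplified stand-ins for Lift's request types.
  sealed trait RequestType
  case object GetRequest extends RequestType
  case object PostRequest extends RequestType
  final case class Req(path: List[String], suffix: String, requestType: RequestType)

  // Stand-in for buildRestResponse: returns a plain string, not a LiftResponse.
  def buildRestResponse(organization: String): String =
    s"contributors of $organization"

  // A partial function: case clauses not preceded by a match.
  // It is defined only for GET requests to /org/{org_name}/contributors.
  val route: PartialFunction[Req, String] = {
    case Req("org" :: organization :: "contributors" :: Nil, _, GetRequest) =>
      buildRestResponse(organization)
  }
}
```

Calling route.isDefinedAt(req) tells whether a request is covered by the case, and route.lift(req) returns an Option that is empty for any request the partial function is not defined for.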
The function buildRestResponse makes use of the auxiliary function contributorsByOrganization, which, given an organization, returns a list of all the contributors to all the repositories of the organization.
The elements of this list are converted from instances of the Contributor class to their JSON representation, and then wrapped inside a JsonResponse that Lift uses to build the JSON payload of the HTTP response. The conversion of Contributor instances to JSON is defined as a method of the Contributor class; it makes use of a very simple DSL included in Lift for that purpose.
Contributor is one of our model classes, together with Repository and Organization. The definition of all our model classes and other auxiliary types can be found in the Scala object Entities.
The complete code of contributorsByOrganization uses logback to log messages that will allow us to trace the execution of our program while responding to a given request, and to know the time employed in doing so. The output of the logger is directed by default to the sbt console where we execute our program.
The code of the auxiliary function contributorsByOrganization used by the REST server module (with logging statements stripped out) is:
Here, we first get (using our REST client module to be explained in the next post) a List[Repository] for the organization at play, and then “convert” that list (again using our REST client) into a List[Contributor] (the detailed list of contributors for the organization). For that purpose, we first use a function of type (Organization) -> List[Repository] and then map the returned list with a function of type (Organization, List[Repository]) -> List[Contributor].
Finally, the detailed list of contributors is processed in the way required by our specification, i.e. grouping their elements by the name of the contributors and ordering the resulting groups by the (descending) total number of their contributions. This is achieved by applying to the detailed list of contributors a sequence of transformations that:
- Convert the original list of contributors into a Map[String, List[Contributor]] using groupBy. A Map[A, B] can be seen as a set of key-value pairs of the types A and B, respectively, that doesn’t contain duplicate keys. In our case the keys are strings (names of contributors), and the values are lists of contributors having in common the name of the contributor (precisely the string returned for a contributor by the function given as argument to groupBy).
- Convert the values of this map (lists of contributors with a common name) into the sum of their contributions, using foldLeft in a way similar to that exemplified in our aside about lists.
- Convert the new map to a list of pairs of contributor names and total number of contributions, using toList.
- Map those pairs to instances of the Contributor class.
- Sort the resulting new list of contributors, first numerically by minus the number of their contributions and then alphabetically by their names (using sortBy, explained in our first aside about lists).
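The five transformations above can be sketched as a single chain of standard collection methods. This is only an illustration under stated assumptions: the fields of Contributor, the function name totalsByContributor and the "All" placeholder used for the repository of the aggregated entries are not taken from the original code.

```scala
object GroupingSketch {
  // Assumed shape of the model class; the real one is defined in the Entities object.
  final case class Contributor(repo: String, contributor: String, contributions: Int)

  // Hypothetical name for the processing step described in the list above.
  def totalsByContributor(detailed: List[Contributor]): List[Contributor] =
    detailed
      // 1. Map[String, List[Contributor]] keyed by contributor name.
      .groupBy(_.contributor)
      // 2. Replace each group by the sum of its contributions.
      .map { case (name, cs) => name -> cs.foldLeft(0)((acc, c) => acc + c.contributions) }
      // 3. List of (name, total) pairs.
      .toList
      // 4. Back to Contributor instances ("All" is an assumed placeholder).
      .map { case (name, total) => Contributor("All", name, total) }
      // 5. Descending by total, then alphabetically by name.
      .sortBy(c => (-c.contributions, c.contributor))
}
```

Sorting by the pair (minus the contributions, name) relies on the standard ordering of tuples, which compares the first components and falls back to the second ones only in case of a tie.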
The code of the function contributorsDetailedSeq, used for building a list of the contributors to the repositories of an organization, given the organization and a list of its repositories, is just:
Here flatMap is used because the result of mapping contributorsByRepo to a List[Repository] will be a List[List[Contributor]], and what we need is just a List[Contributor], a “flattened” version of the previous list. The desired result could have been obtained using the equivalent expression repos.map(contributorsByRepo(organization, _)).flatten.
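The equivalence can be checked with a small self-contained sketch. Here the model types are assumed shapes (the real ones are in Entities), contributorsByRepo is a stub returning in-memory data instead of calling the GitHub API, and viaMapAndFlatten is a hypothetical name introduced only for the comparison.

```scala
object FlatMapSketch {
  // Assumed shapes of the model types; the real ones are defined in the Entities object.
  type Organization = String
  final case class Repository(name: String)
  final case class Contributor(repo: String, contributor: String, contributions: Int)

  // Stub standing in for the REST client call: in-memory data, no HTTP involved.
  def contributorsByRepo(organization: Organization, repo: Repository): List[Contributor] =
    repo.name match {
      case "r1" => List(Contributor("r1", "ann", 3), Contributor("r1", "bob", 5))
      case "r2" => List(Contributor("r2", "ann", 4))
      case _    => Nil
    }

  // flatMap maps each repository to its contributors and flattens in one step.
  def contributorsDetailedSeq(organization: Organization, repos: List[Repository]): List[Contributor] =
    repos.flatMap(contributorsByRepo(organization, _))

  // The same result in two steps: List[List[Contributor]] first, then flatten.
  def viaMapAndFlatten(organization: Organization, repos: List[Repository]): List[Contributor] =
    repos.map(contributorsByRepo(organization, _)).flatten
}
```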
3. Further work to enhance our solution
The main problem with our first implementation of the requested service is efficiency. It is a simple and clear solution to the problem, but the REST calls to the GitHub API are synchronous: every call starts only when the preceding one has finished, blocking in that way the (only) execution thread used.
The next installment of this series presents an asynchronous implementation of the service realized using Scala futures. It was necessary to change just a few lines of the code of our synchronous version to obtain a considerable reduction in the time needed to serve a request. Interested? You can find this second way to implement a REST-API in Scala here.