Big data applications in Apache Spark with Scala

Instructors: Juan Manuel Serrano, Javier Fuentes
Date: July-August 2017
Duration: 24 hours

Overview

This training course introduces the fundamentals of Scala, functional programming and Apache Spark to developers of big data applications. It has three main goals: 1) using Scala idiomatically; 2) knowing the functional programming patterns exposed in the Spark API; and 3) understanding how Spark computations work under the hood. The course is structured into two blocks:

  • PART I. The first block focuses exclusively on the Scala features and functional programming design patterns needed to use Spark effectively: mainly higher-order functions, type classes and functional DSLs (domain-specific languages). Examples in this part are drawn from the domain of big data applications whenever possible.
  • PART II. The second block focuses on two components of the Spark architecture: Spark SQL and Spark Streaming. Where appropriate, it also employs functional programming libraries designed to use Spark in a more type-safe way (e.g. frameless) and functional design patterns (e.g. state machines for streaming algorithms).
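
To give a flavour of the Part I topics, the following is a minimal sketch, in plain Scala with no Spark dependency, of a type class combined with a higher-order function, applied to a word count. It is not taken from the course materials: the names Monoid, foldMap, wordCounts and TypeClassSketch are all hypothetical and chosen only for illustration.

```scala
// Illustrative sketch only; every name here is hypothetical, not course material.
object TypeClassSketch {

  // Type class: values of type A can be combined, with a neutral element.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  object Monoid {
    // Instance for Int under addition.
    implicit val intSum: Monoid[Int] = new Monoid[Int] {
      val empty: Int = 0
      def combine(x: Int, y: Int): Int = x + y
    }

    // Derived instance: maps merge key-wise whenever their values form a monoid.
    implicit def mapMerge[K, V](implicit V: Monoid[V]): Monoid[Map[K, V]] =
      new Monoid[Map[K, V]] {
        val empty: Map[K, V] = Map.empty
        def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
          y.foldLeft(x) { case (acc, (k, v)) =>
            acc.updated(k, V.combine(acc.getOrElse(k, V.empty), v))
          }
      }
  }

  // Higher-order function: the mapping step f is passed in as a parameter,
  // and the combining step is supplied by the Monoid instance for B.
  def foldMap[A, B](as: Seq[A])(f: A => B)(implicit B: Monoid[B]): B =
    as.foldLeft(B.empty)((acc, a) => B.combine(acc, f(a)))

  // Word count: each word becomes a one-entry map, and the maps are merged.
  def wordCounts(lines: Seq[String]): Map[String, Int] = {
    val words = lines.flatMap(_.split("\\s+").toSeq).filter(_.nonEmpty)
    foldMap(words)(w => Map(w -> 1))
  }

  def main(args: Array[String]): Unit =
    println(wordCounts(Seq("spark scala spark", "scala")))
}
```

The same map-then-combine shape is what makes such a computation easy to distribute: because combining is associative, partial results can be merged in any grouping.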

Outline

  • Introduction to Scala
    • Object-oriented programming with Scala
    • Generic programming in Scala
    • Unit testing with ScalaTest
  • Higher-order functions & ADTs
    • Algebraic data types
    • Higher-order functions
    • The standard collections library
  • Type classes
    • Type classes as a design pattern
    • Type classes vs. conventional object-oriented patterns
    • Type constructor classes: Functors
  • Functional architectures
    • Type-class based APIs
    • Domain-specific languages as APIs
    • Imperative programming with Monads
  • Spark SQL
    • DataFrames and Datasets
    • Data sources: Hive, Parquet, JSON
    • Functional frameworks: frameless
  • Spark Streaming
    • DStreams
    • Input sources and output operations
    • Structured Streaming

About the coordinator

Juan Manuel is CTO and co-founder of Habla Computing. He has used Scala for the last six years in real-world applications for the banking sector, and has extensive experience in consultancy projects with Scalaz, Cats, and other frameworks of the Scala ecosystem. He founded and manages the Madrid Scala Meetup group and is a member of the organizing committee of lambda.world, the premier Spanish conference of functional programming for the software practitioner. Prior to joining Habla Computing, Juan Manuel was a university lecturer for more than fifteen years, teaching in various computer science and software engineering degree programs.
