23 Jun 2021
01 Mar 2021
We tested OpenAI’s GPT-3 (one of the largest Artificial Intelligence (AI) language models ever created) on ten Spanish customer conversations about banking. Without any prior customization, re-training or transfer learning whatsoever, we asked it to generate summaries that would be useful to any bank agent who joins (or resumes) the conversation.
16 Jun 2020
A few weeks ago some colleagues from a development team told us about their concerns over the JSON Web Tokens (JWT) they were generating as part of a new tool integration. They had heard about several security issues regarding the use of JWTs, so they asked us to help validate that the tokens they were issuing were correct and met some basic security requirements.
We are currently working on a project to help automate security tests, APICheck, which we’ve recently released as open source. APICheck consists of a set of small tools that can be chained through pipes to run several tests on API requests, so we got down to work on a new tool for validating the JWTs they were issuing, jwt-checker, in which we’ve implemented the validations we’ll talk about. Below I’ll show you an example of a test implemented using the tool.
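As a rough illustration of the kind of basic checks a JWT validator might run, here is a minimal Python sketch. It is not jwt-checker itself, whose checks and interface are not described in this excerpt; the function names (`basic_jwt_issues`, `make_token`) and the chosen checks are assumptions made for the example. It only inspects the header and claims and does not verify the signature.

```python
import base64
import json
import time
from typing import List, Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url-encoded without padding; restore it first.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(obj: dict) -> str:
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_token(header: dict, payload: dict, signature: str = "") -> str:
    """Hypothetical helper: assemble a token for testing (no real signing)."""
    return ".".join([_b64url_encode(header), _b64url_encode(payload), signature])


def basic_jwt_issues(token: str, now: Optional[float] = None) -> List[str]:
    """Return a list of basic problems found in a JWT (signature NOT verified)."""
    parts = token.split(".")
    if len(parts) != 3:
        return ["token does not have three dot-separated segments"]
    try:
        header = json.loads(_b64url_decode(parts[0]))
        payload = json.loads(_b64url_decode(parts[1]))
    except ValueError:
        return ["header or payload is not valid base64url-encoded JSON"]
    if not isinstance(header, dict) or not isinstance(payload, dict):
        return ["header or payload is not a JSON object"]

    issues = []
    # An absent or "none" algorithm means the token is effectively unsigned.
    if str(header.get("alg", "none")).lower() == "none":
        issues.append("alg is missing or 'none'")
    if not parts[2]:
        issues.append("signature segment is empty")
    # Tokens without an expiry stay valid forever if leaked.
    if "exp" not in payload:
        issues.append("exp claim is missing")
    elif payload["exp"] <= (time.time() if now is None else now):
        issues.append("token is expired")
    return issues
```

For example, `basic_jwt_issues(make_token({"alg": "none"}, {"sub": "x"}))` reports the unsigned algorithm, the empty signature and the missing expiry. A real validator would also verify the signature against the expected key and algorithm.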
06 Apr 2020
Transactionality in databases is fundamental for critical systems, but the historical asymmetry between reads (massive) and writes (targeted, but liable to land in the middle of a read operation) could be solved with a new approach. Last year Databricks released to the community a new data persistence format built on Write-Once Read-Many storage (HDFS, S3, Blob storage) and based on Apache Parquet.
19 Sep 2019
Cost and revenue factors impacting economic activities are often too complex to be properly modelled, in which case optimization through Artificial Intelligence (AI) can lead to academically correct solutions that fall far short of being economically optimal.
05 Apr 2019
In recent months, BBVA Next Technologies has devoted some effort to researching tools and techniques for interpreting machine learning models. These techniques are very useful for understanding the predictions of a model (or helping others understand them), for extracting business insights from a model that has managed to capture the underlying patterns of customer interest, and for debugging models to ensure they make the right decisions for the right reasons.
In this article we will explain how we have applied these techniques to avoid deploying into production flawed models that seemed totally correct a priori according to standard validation methods.
01 Mar 2019
04 Feb 2019
27 Nov 2018
The problem of data integrity verification is a classic topic that has been well studied in recent decades. Many database or log systems run on untrusted servers or are subject to malicious attacks from insiders and are, therefore, vulnerable to tampering. With the advent of cloud computing and the possibility of dealing with outsourced data, this situation has been aggravated. This problem is an essential challenge for financial institutions, where preserving the correctness of customer data and transactions is crucial in terms of legal compliance and reputation.
25 Jun 2018
01 Jun 2018
This post in 10 seconds
In this post we will address software dependency management: the security problems it implies, how to automate it, how to receive alerts about new vulnerabilities in real time, and how to implement a productive and satisfactory system using new Open Source tools such as Deeptracy and Patton.
06 Apr 2018
There are several Container Orchestration Engines available, such as Kubernetes, OpenShift, DC/OS, Nomad or Swarm, to name a few. On occasion, the optimal solution to deploy your apps requires picking not one of those, but two or more. Each has its strengths and is targeted toward a specific kind of load. For instance, Kubernetes is especially well suited to deploying microservices very dynamically. However, in such a dynamic environment it is hard to handle a persistent database, a task better handled by a system such as DC/OS.
06 Mar 2018
At BBVA Labs, we have been using Akka Actors in different projects for a long time due to their inherent computation distribution capabilities in high load scenarios, using actors as stateless or stateful units of execution that send messages to one another asynchronously.
In this post, we will cover some of the weak points of working with untyped classic actors and show the benefits that Akka Typed brings to the scene. Finally, we will present a brief description of session types and the Process DSL API created by Dr. Roland Kuhn. All examples are written using the Scala API.
30 Nov 2017
Today, a large majority of modern apps or services are RESTful APIs and use API definitions to facilitate communications between them, as it saves us from having to worry about the language and the underlying implementation of the remaining components. APIs make even more sense in microservice or serverless architectures with dozens or hundreds of mutually interacting microservices/functions.
06 Nov 2017
At BBVA Labs, we follow the test pyramid concept proposed by Mike Cohn. We have a large collection of unit tests that are easy to implement and which are run at every change in code; a collection of acceptance tests which are run whenever the previous tests are passed; and finally, end-to-end tests that are only run to unlock a function.
The complexity of implementing these tests increases as you move up the pyramid. End-to-end tests, where service integration is also tested, require putting in place the infrastructure, the services to be tested and the integrated services. Setting up a testing environment, implementing the tests and executing them are all significantly more complex than for unit tests.
In addition to the cost of running these tests, another problem arises when a service changes the message format. Big bang deployment (deploying the service and its dependents at the same time) is to be avoided, as this type of change breaks continuous deployment. Therefore, for a period of time, the service provider has to offer support for two versions of the message while consumers update to the new one, but consumer tests only look at one version of the producer.
At BBVA Labs, we conducted an experiment to reduce the number of services to deploy in tests and to ensure that communication among services from different domains is maintained throughout the software product’s life cycle in a continuous deployment system. The decision was made to evaluate current tools for Consumer Driven Contract testing (CDC testing or contract testing) for this experiment.
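The idea behind contract testing can be sketched in a few lines of Python. This is only an illustration under assumed names, not one of the tools evaluated in the experiment: the consumers (`billing`, `notifications`), the message fields and the helper functions are all hypothetical. Each consumer declares the fields it relies on and the message version it reads, and the provider is checked against every declared contract without deploying any consumer.

```python
# Example messages the provider currently emits, one per supported version
# (hypothetical payloads for illustration).
PROVIDER_MESSAGES = {
    "v1": {"account_id": "ES00", "amount": 12.5},
    "v2": {"account_id": "ES00", "amount_cents": 1250, "currency": "EUR"},
}

# Each consumer pins a message version and lists only the fields it uses.
CONSUMER_CONTRACTS = {
    "billing": ("v1", {"account_id": str, "amount": float}),
    "notifications": ("v2", {"account_id": str, "amount_cents": int}),
}


def unmet_fields(contract: dict, message: dict) -> list:
    """Return contract fields that are absent or of the wrong type."""
    return [
        name
        for name, expected_type in contract.items()
        if not isinstance(message.get(name), expected_type)
    ]


def verify_provider(messages: dict, contracts: dict) -> dict:
    """Map each consumer whose contract is broken to its unmet fields."""
    failures = {}
    for consumer, (version, contract) in contracts.items():
        problems = unmet_fields(contract, messages.get(version, {}))
        if problems:
            failures[consumer] = problems
    return failures
```

Running `verify_provider(PROVIDER_MESSAGES, CONSUMER_CONTRACTS)` returns an empty dictionary while both versions are served. If the provider dropped `v1` too early, the `billing` contract would be reported as broken before anything reached production.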
23 Oct 2017
Using Docker to deploy software in production systems solves many problems related to the agility and standardization of these processes. But, like any technology that breaks with prior IT processes, it creates new challenges or requires different solutions to persistent problems. One of those is the management of secrets.
01 Oct 2017
This article is the second part of our series on serverless, in which we cover the integration of one of the most interesting products implementing this technology (Fission) in OpenShift, Red Hat’s PaaS platform.
Before this article, we published an analysis of serverless architectures, or FaaS (Functions as a Service), to introduce this technology together with its value for business development.
27 Sep 2017
Serverless architecture, also known as FaaS (Functions as a Service), enables the execution of an application via ephemeral, stateless containers; these are created at the moment an event occurs that triggers the application. Contrary to what the term suggests, serverless does not mean “without a server.” Rather, servers are used as an anonymous element of the infrastructure, supported by the advantages of cloud computing.
Leaving aside the hype that accompanies the term serverless, in this post we explore the possibilities it offers in architecture and application development. We also study the principal alternatives for use in public and private clouds.
Training neural networks is a time-consuming activity: the amount of computation needed is usually high, even by today's standards. There are two ways to reduce the time needed: use more powerful machines or use more machines.
The first approach can be achieved using dedicated hardware like GPUs, or perhaps FPGAs or TPUs in the future. The second splits the task across more general-purpose hardware, like that used in cloud systems.
This document summarizes the conclusions reached after researching the use of distributed neural networks.
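One common way to "use more machines" is synchronous data parallelism: each worker computes gradients on its own shard of the data, and the averaged gradient updates a shared model. The sketch below is a toy illustration of that idea, not the setup studied in the document. It fits a one-variable linear model in plain Python, simulates the workers in a loop, and assumes equally sized, non-empty shards; all function names are hypothetical.

```python
def gradient(w, b, shard):
    """Gradient of the mean squared error of y ~ w*x + b on one shard."""
    n = len(shard)
    gw = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    gb = sum(2 * (w * x + b - y) for x, y in shard) / n
    return gw, gb


def data_parallel_step(w, b, shards, lr):
    """One synchronous step: per-shard gradients, then a single averaged update."""
    # In a real cluster each call would run on a different machine.
    grads = [gradient(w, b, shard) for shard in shards]
    gw = sum(g[0] for g in grads) / len(grads)
    gb = sum(g[1] for g in grads) / len(grads)
    return w - lr * gw, b - lr * gb


def train(data, workers, steps=2000, lr=0.05):
    """Split the data round-robin across workers and run gradient descent."""
    shards = [data[i::workers] for i in range(workers)]
    w, b = 0.0, 0.0
    for _ in range(steps):
        w, b = data_parallel_step(w, b, shards, lr)
    return w, b


# Toy dataset generated from y = 3x + 1.
DATA = [(float(x), 3.0 * x + 1.0) for x in range(4)]
```

With equal shards, the averaged gradient matches the gradient over the full dataset, so `train(DATA, workers=2)` follows the same trajectory as single-machine training (up to floating-point rounding) while each worker touches only half of the data per step.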