I open this post by paraphrasing Duncan Watts, author of the book “Tudo é óbvio*” (Everything Is Obvious*), in which he explores “how common sense fails us” and offers a new way of thinking. Dr. Watts was chief researcher at Yahoo! Research and one of the pioneers in studying social dynamics through the ease and revolution brought by the internet and social networks. He raised questions and offered data-backed answers that, according to him, are indispensable to “business people, politicians, scientists, and all of us.”
My first contact with this book came right after its release: it was on full display at the entrance of one of my favorite café-bookstores in Porto Alegre. Its blatant obviousness went unnoticed, except for that nagging red asterisk, already hinting at the promise printed on its cover.
The printed copy, resting on my bookshelf, is marked with notes and highlights from a reading done in mid-2012, when I still worked in innovation and new technology development. We were a relatively small team discussing projects that, years later, would affect the daily lives of millions of people. That said, the reasons that led me to underline the following passage become obvious:
“(...) A small number of people sitting in conference rooms are using their commonsense intuition to predict, manage, or manipulate the behavior of thousands or millions of distant and diverse people whose motivations and circumstances are very different from their own.”
D. Watts, 2011. “Tudo é Óbvio* Desde que você saiba a resposta” (Everything Is Obvious* Once You Know the Answer), p. 35.
Today, a decade later, I am rereading the book and revisiting those notes, now with the privilege of being able to validate them. My activities and perspectives have changed, just as the world itself has changed. Twitter, the predominant social network at the time and the inspiration for many of Dr. Watts's analyses, remains extremely active. Now, however, it competes for bandwidth with many other networks and is flooded in great lakes of information (better known as data lakes). Conference rooms have also been flooded with data and today, one could say, are at the very least hybrid: combinations of universes and metaverses.
It is in this context that commonsense intuition has given way to data-driven intuition, that is, to machine learning models. The goal, however, has not changed at all: we keep searching for models to “predict, manage, or manipulate the behavior” and, given their sophistication, we presume that their quality has improved.
Are algorithms, then, free from the bias caused by common sense?
My intention in revisiting the chapters of this work, whose reflections I will share here, is precisely to question the impact on data science of common-sense bias, as well as of the “frame problem”, the “micro-macro problem”, “hindsight bias”, and the many others pointed out by the author.
Lean Six Sigma is a continual improvement methodology broadly applied in the production and services industries. In this post series, we will walk through it by applying its tools to the process of writing this blog.
A year ago, I decided to restart writing a blog. I was excited about sharing my research ideas, some data analysis code, and some daily findings or even ramblings. I should mention that I had even created an editorial calendar full of well-intentioned posts, some of them already started.
But as you can guess from reading this now, I wrote nothing more, and a full year passed before I showed up again. I would be lying if I said my willingness to write had changed. Indeed, it may even have increased. So why didn’t I write?
A trivial answer would be to blame the lack of time; in fact, this was one of my busiest years in decades. However, this alone does not explain the whole story. So I decided to apply the Lean Six Sigma (L6S) methodology to my blog-writing activity. After all, having started my Green Belt certification, alongside several other demands, could be part of my naive justification for my (non-)writing.
Lean Six Sigma methodology
In short, L6S is a continual improvement methodology that seeks to make processes leaner and more efficient by reducing their variability and waste. Its history is tied to the production management programs developed by Motorola and Toyota, among other companies, throughout the 1980s and 1990s. These, in turn, are based on statistical process control, whose development and application began in the 1920s. Currently, the L6S methodology is adopted by numerous companies and is covered by an international standard (ISO 13053, first released in 2011).
The most common L6S approach is known as DMAIC, an acronym for its five phases, as follows:
Define: in this phase, an overview of the process is drawn by observing its main problems, establishing the goals, and getting a first glimpse of possible solutions. The outcome of this phase is a project charter describing the problem, the project scope, and the expected benefits.
Measure: once the problem has been defined, this phase is intended to collect as much meaningful information about it as possible. These data become the baseline for validating the process improvement.
Analyze: this phase seeks to analyze the data collected during the previous one (and which will continue to be collected). The primary outcome must be an in-depth analysis of the problems' root causes. For this task, there are many statistical tests available in the L6S toolbox.
Improve: based on the previous phases, here we propose solutions for the identified problems, supported by the evidence found. The main outcome is the new process map (the process as it should be) with a set of actions to be implemented.
Control: in this final phase, a control plan is developed both to measure the effectiveness of the actions undertaken and to ensure they are sustained.
My certification is still in progress; in addition to the theoretical training, it involves leading a continual improvement project with proven financial results for the company I work for. Applying such a robust methodology to a blog-writing process should be seen as an exercise, since it is a personal endeavor that does not carry any financial KPI, as most corporate processes do.
Processes can be continually improved
Before we dive into the first phase of the L6S methodology, it is worth discussing how activities can be understood as a process. First of all, a process is a recurrent activity that has inputs, operations or functions that process those inputs, and outputs, which result from that processing.
I usually distinguish a process from a system by considering the latter as composed of many processes. In the diagram below, we can find a more complex version of our basic process. In this case, the single input x is replaced by the set X of m inputs. We may also have more than one output, now represented by the set Y of n outputs. In turn, our system is represented by the set F of c functions or operations.
Although one may successfully apply the L6S methodology without using the notation above for the system and its processes, understanding this formalization can be very useful for statistical analysis and automation. In any case, what we want is to improve our system so that the functions in F perform as well as possible. In other words, our ideal system should output the best Y values for a given set X of inputs.
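To make the notation a bit more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the functions in F are applied one after another to the inputs X to produce the outputs Y; the names run_system, draft, and publish are hypothetical and not part of any L6S tool.

```python
from functools import reduce
from typing import Callable, Dict, List

# A process state is a plain dictionary; each function in F takes the
# current state and returns a new one (an assumption of this sketch).
State = Dict[str, object]
Process = Callable[[State], State]


def run_system(functions: List[Process], inputs: State) -> State:
    """Apply each function in F, in order, to turn the inputs X into the outputs Y."""
    return reduce(lambda state, f: f(state), functions, inputs)


# Hypothetical blog-writing operations, for illustration only.
def draft(state: State) -> State:
    return {**state, "draft": f"Post about {state['topic']}"}


def publish(state: State) -> State:
    # A post only counts as published if a draft exists.
    return {**state, "published": "draft" in state}


X = {"topic": "Lean Six Sigma", "goal": "share ideas"}  # set of inputs
F = [draft, publish]                                    # set of functions
Y = run_system(F, X)                                    # set of outputs
```

Writing each operation as a separate function is a design choice that makes it easy to measure or replace a single step later without touching the others.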
By using the L6S tools, we can avoid failures and optimize our processes down to a minimal rate of waste. However, since any process in our universe obeys the laws of thermodynamics, we must always expect some waste.
In this context, the sigma value of a given process tells us how many failures (or wastes) are expected for every million opportunities (measured in DPMO, defects per million opportunities). As a reference, in a six sigma process the DPMO value is only 3.4, i.e., such a process allows only 3.4 failures for every 1 million opportunities.
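The relationship between sigma level and DPMO can be sketched with a few lines of Python. This is only an illustration: it assumes a normally distributed process, a one-sided defect tail, and the conventional 1.5 sigma long-term shift, which is the convention under which six sigma corresponds to 3.4 DPMO. The function names are hypothetical.

```python
import math
from statistics import NormalDist

SHIFT = 1.5  # conventional long-term shift (an assumption of this sketch)


def dpmo_from_counts(defects: int, units: int, opportunities_per_unit: int = 1) -> float:
    """Defects per million opportunities from observed counts."""
    return defects * 1_000_000 / (units * opportunities_per_unit)


def dpmo_from_sigma(sigma_level: float, shift: float = SHIFT) -> float:
    """Expected DPMO for a given sigma level (one-sided normal tail)."""
    z = sigma_level - shift
    # 0.5 * erfc(z / sqrt(2)) is the upper-tail probability P(Z > z).
    return 0.5 * math.erfc(z / math.sqrt(2)) * 1_000_000


def sigma_from_dpmo(dpmo: float, shift: float = SHIFT) -> float:
    """Sigma level for a given DPMO; requires 0 < dpmo < 1,000,000."""
    return NormalDist().inv_cdf(1 - dpmo / 1_000_000) + shift


six_sigma_dpmo = dpmo_from_sigma(6.0)  # about 3.4
```

With observed data, dpmo_from_counts gives the current defect rate, and sigma_from_dpmo turns it into a sigma level that can serve as the baseline for later phases.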
The blog-writing process
Fortunately, many activities can be described as processes, as is the case with scientific writing and blog writing. In both cases, the inputs can be the reasons why someone should write, or their objectives. The operations or functions are the writing process itself. Naturally, the output is the text in its final version.
From my perspective, the most serious flaw is easy to identify: the text is simply not published, i.e., its output is not delivered.
I hope to have the opportunity to detail how fun and surprising it can be to apply such a robust methodology to a relatively simple and personal problem. The message I want to leave, for now, is that a method such as L6S makes us rethink what the real aims of the process are, what tools we have to evaluate them, and how the results are being achieved. In short, it enables us to answer questions like these:
What did I want by creating this blog?
Why did the writing not occur?
What can be done to make this process efficient, i.e., to deliver what is expected?
If we look closely, we can see how closely these questions are related to strategic planning. Hence the relevance of L6S to the most diverse applications. If, in your case, there is a well-defined strategic plan, L6S will undoubtedly help you improve it. Otherwise, this will be one of the first needs identified by the methodology and, best of all, by using L6S you won’t have to create a strategic plan from scratch.
Interested? In the next post of this series, I will explore which L6S tools will help me answer the first question. Follow the blog to be notified. Until then!