Latest Posts

Integrating Agile and UX Methodology

Both Agile development and User Experience (UX) design have been popular methodologies in software design and project management circles of late, yet despite that popularity there has been little effort to integrate them. Each approaches development from a different perspective: Agile focuses on coding and project management, while UX is concerned with the usability of the product and the actual user interface. Of course, these are two sides of the same coin. A software solution is unusable without some form of user interface, and a user interface is but an empty shell without a quality software product behind it.

Ferreira, Sharp, & Robinson (2011) suggest a framework for Agile and UX integration, founded on five principles:

  1. The user should be involved in the development process; 
  2. Designers and developers must be willing to communicate and work together extremely closely; 
  3. Designers must be willing to provide developers with prototypes and user feedback; 
  4. User-centered design (UCD) practitioners must be given ample time to discover users’ basic needs before any code is written; and, 
  5. Agile/UCD integration must exist within a cohesive project management framework.

The framework itself has the software developers and UX designers running in parallel iterations, giving and receiving feedback in each iteration, with the UX team working one Sprint ahead of the development team. Both start with a Sprint 0 to establish context and perform task analysis for the project ahead. This ultimately generates User Stories, which are then distributed across the Sprints. The Stories pass through the UX team before being delivered to the development team, so that the UX designers can work from the User Stories and intended outcomes to produce the interface. The authors observed both design and development teams in action and noted many areas for integration and improvement. Most notably, they found that UX designers had no User Stories specific to them and found it difficult to design one Sprint ahead; usually they were either on the same Sprint as the developers or one behind.

The authors present a solid argument for integrating Agile and UX methodologies, and since the article's publication in 2011 the idea has caught on (e.g., Gothelf, 2018). A variety of Agile- and UX-flavored methods circulate in the DevOps world at any given time, each with dedicated followers and applications.

References 

Ferreira, J., Sharp, H., & Robinson, H. (2011). User experience design and agile development: managing cooperation through articulation work. Software: Practice and Experience, 41(9), 963-974. doi:10.1002/spe.1012

Gothelf, J. (2018). Here is how UX design integrates with Agile and Scrum. Retrieved from https://medium.com/swlh/here-is-how-ux-design-integrates-with-agile-and-scrum-4f3cf8c10e24

MetroMaps and T-Cubes: Beyond Gantt Charts

Martínez, Dolado, & Presedo (2010) discuss two visual modeling tools for software development and planning: the MetroMap and the T-Cube. The discussion comes in the context of greater attention being paid to the development process and its metrics, not just the software engineering itself. A concession the authors make early on is that Gantt charts are the prevalent method for project mapping in organizations, and that the research to date shows they are not effective for communication, especially when different groups are involved. Enter the MetroMap, a way of visualizing abstract information as metro-style lines that communicates both high-level and detailed information to viewers.

Image courtesy of Martínez, Dolado, & Presedo (2010)

T-Cube visualization is reminiscent of a Rubik’s Cube, utilizing the three-dimensional nature of a physical cube, the individual cubes that make up the whole, and the facets (colors) on each individual cube; these elements correspond to tasks and their attributes. The authors illustrate the concepts with a specific software project in the article. Because the tasks and attributes are recorded independently, they can be represented by workgroup, type of task, module, or time.

These two methods have their strengths and weaknesses, both individually and together. At first glance, it is obvious that the MetroMap can represent many indicators at once while the T-Cube can only show one at a time. MetroMap uses a variety of icons and styles to represent information while the T-Cube uses traditional treemaps. The authors size up the tools in a simple comparison table, noting that MetroMap generally has the edge on viewing a lot of information at once.

Features and benefits are great, but how does actual use differ? Is one easier than the other in practice? The authors examined the shortest-path route to accomplishing the same task in both tools and found that the MetroMap was more efficient in multiple scenarios; in every case its actions were more basic and straightforward. Overall, either tool is more informative and effective than a Gantt chart. Access to information and the ability to understand it are paramount in any planning and development exercise, and these are two tools that better enable both.

Reference

Martínez, A., Dolado, J., & Presedo, C. (2010). Software project visualization using task oriented metaphors. Journal of Software Engineering and Applications, 3, 1015-1026.

Delphi Methods and Ensemble Classifiers

Ensemble classifiers are a bit like the Delphi method, in that they utilize multiple models (or experts) to arrive at better predictive performance than a single model would offer (Dalkey & Helmer, 1963; Acharya, 2019). These are independent or parallel classifiers, implementing a majority vote amongst themselves much as the Delphi method does amongst experts. A variety of individual classifiers can be used, including logistic regression, nearest-neighbor methods, decision trees, Bayesian analysis, or discriminant analysis. According to Dietterich (2002), ensemble classification overcomes three major problems: Statistical, Computational, and Representational. The Statistical problem arises when the hypothesis space is too large for the available data, producing multiple accurate hypotheses of which only one can be chosen. The Computational problem is the algorithm’s inability to guarantee that it finds the best hypothesis. The Representational problem arises when the hypothesis space contains no good approximation of the target.
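
To make the majority-vote analogy concrete, here is a minimal sketch (my own illustration, not drawn from the cited sources) using scikit-learn's VotingClassifier to pool three of the base learners mentioned above; the synthetic dataset and parameter choices are arbitrary.

```python
# Minimal majority-vote ensemble sketch (illustrative only; synthetic data).
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Each base classifier "votes" independently; the ensemble takes the hard
# (majority) vote, loosely analogous to polling a panel of Delphi experts.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("tree", DecisionTreeClassifier(max_depth=5)),
    ],
    voting="hard",
)

print("ensemble accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```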

Ensemble methods include bagging, boosting, and stacking. Bagging is considered a parallel or independent method; boosting and stacking are sequential or dependent methods. Parallel methods are used when independence between the base classifiers is advantageous, for example to reduce error through averaging; sequential methods are used when dependence between the classifiers is advantageous, such as correcting mislabeled examples or converting weak learners into strong ones (Smolyakov, 2017).
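
As a rough illustration of the parallel/sequential distinction (again my own sketch, not from Smolyakov), scikit-learn's BaggingClassifier builds independent learners on bootstrap samples, while AdaBoostClassifier builds learners one after another, reweighting the examples the previous ones got wrong; both default to decision-tree base learners.

```python
# Bagging (parallel) vs. boosting (sequential) on the same synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

bagging = BaggingClassifier(n_estimators=50, random_state=0)    # independent trees on bootstrap samples
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)  # each learner corrects its predecessors

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```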

Random forests are not heterogeneous ensembles in the sense described above, but they do produce results from multiple decision trees and aggregate them, much like bagging (Liberman, 2017). Each tree trains on a different bootstrap sample of the data and a randomly selected subset of features. Bias and variance errors are mitigated by the low correlation between the individual trees. Again, as with ensemble classifiers and even Delphi-style decision-making, learners operating as a committee should outperform any of the individual learners.
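
A random forest is only a few lines in the same vein; the sketch below (illustrative, with arbitrary parameters) highlights the two sources of randomness that keep the trees weakly correlated.

```python
# Random forest: bagging over trees plus random feature subsets at each split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

forest = RandomForestClassifier(
    n_estimators=200,     # number of trees voting as a committee
    max_features="sqrt",  # random feature subset considered at each split
    bootstrap=True,       # each tree trains on a bootstrap sample of the rows
    random_state=0,
)
print("forest accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```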

References

Acharya, T. (2019). Advanced ensemble classifiers. Retrieved from https://towardsdatascience.com/advanced-ensemble-classifiers-8d7372e74e40

Connolly, T. & Begg, C. (2015).  Database Systems: A Practical Approach to Design, Implementation, and Management (6th ed.). London, UK: Pearson. 

Dalkey, N., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9(3), 458-467.

Dietterich, T. G. (2000). Ensemble methods in machine learning. International workshop on multiple classifier systems (pp. 1-15). Springer Berlin Heidelberg.

Dietterich, T. G. (2002). Ensemble Learning. In The Handbook of Brain Theory and Neural Networks, Second Edition, (M.A. Arbib, Ed.), (pp. 405-408). Cambridge, MA: The MIT Press.

Liberman, N. (2017). Decision trees and random forests. Retrieved from https://towardsdatascience.com/decision-trees-and-random-forests-df0c3123f991

Smolyakov, V. (2017). Ensemble learning to improve machine learning results. Retrieved from https://blog.statsbot.co/ensemble-learning-d1dcd548e936

Tembhurkar, M. P., Tugnayat, R. M., & Nagdive, A. S. (2014). Overview on data mining schemes to design business intelligence framework for mobile technology. International Journal of Advanced Research in Computer Science, 5(8).

Decision making with Delphi

The Delphi method brings together subject matter experts with a range of experiences for multiple rounds of questioning, in order to arrive at the strongest consensus possible on a topic or series of topics (Okoli & Pawlowski, 2004; Pulat, 2014). The first round, typically a questionnaire, is used to generate the ideas that subsequent rounds will weight and prioritize; it is the most qualitative of the steps, while the later rounds become more quantitative. According to Pulat (2014), ideas are listed and prioritized by a weighted point system with no communication between the subject matter experts, which is meant to avoid confrontation (Dalkey & Helmer, 1963). Results, any available data requested by one or more experts, and any new information an expert considers potentially relevant can be shown to the entire panel (Dalkey & Helmer, 1963; Pulat, 2014). 
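
As a toy illustration of how such a weighted point system might be tallied between rounds (the ideas, scores, and weights below are invented, not taken from Pulat), each expert scores the ideas independently and only the aggregated ranking is fed back to the panel:

```python
# Hypothetical Delphi-style tally: independent expert scores, weighted and summed.
ideas = ["idea A", "idea B", "idea C"]

expert_scores = {                      # points assigned by each expert in isolation
    "expert 1": {"idea A": 5, "idea B": 3, "idea C": 1},
    "expert 2": {"idea A": 4, "idea B": 5, "idea C": 2},
    "expert 3": {"idea A": 3, "idea B": 4, "idea C": 5},
}
expert_weights = {"expert 1": 1.0, "expert 2": 1.5, "expert 3": 1.0}

totals = {
    idea: sum(expert_weights[e] * scores[idea] for e, scores in expert_scores.items())
    for idea in ideas
}

# Only the ranked consensus (not who said what) goes back out for the next round.
for idea, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(idea, total)
```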

While Delphi begins with and retains a sense of qualitative research about it, traditional forecasting relies mostly on quantitative methods, using mathematical formulations and extrapolations as its mechanical basis (Wade, 2012). Using past behavior as a predictor of future positioning, a most likely scenario is extrapolated (Wade, 2012; Wade, 2014). This scenario modeling confines planning to a formulaic process much like regression modeling. Both Delphi and traditional forecasting use quantitative methods; the difference is one of degree. A key question in deciding which method to use is what personalities are involved: Delphi gives the most consideration to big personalities and potentially fragile egos, avoiding any direct confrontation or disagreement.
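
A minimal sketch of that kind of extrapolation (the figures below are invented) is simply a fitted trend projected forward:

```python
# Trend extrapolation as a stand-in for traditional forecasting (made-up data).
import numpy as np

years = np.array([2015, 2016, 2017, 2018, 2019])
demand = np.array([100.0, 108.0, 115.0, 121.0, 130.0])

slope, intercept = np.polyfit(years, demand, deg=1)  # least-squares linear trend
for year in (2020, 2021, 2022):
    print(year, round(slope * year + intercept, 1))  # the "most likely" scenario
```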

References

Dalkey, N., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9(3), 458-467.

Okoli, C., & Pawlowski, S. D. (2004). The Delphi method as a research tool: An example, design considerations and applications. Information & Management, 42(1), 15-29.

Pulat, B. (2014). Lean/six sigma black belt certification workshop: Body of knowledge. Creative Insights, LLC.

Wade, W. (2012). Scenario planning: A field guide to the future. John Wiley & Sons. VitalSource Bookshelf Online.

Thick Data and Big Data

In March 1968, Robert F. Kennedy said of the Gross National Product: “It measures neither our wit nor our courage, neither our wisdom nor our learning, neither our compassion nor our devotion to our country; it measures everything, in short, except that which makes life worthwhile.”

“What is measurable is not always what is valuable.” With this line, Wang (2016b) paraphrased Kennedy’s point about an index unable to measure the qualitative human condition. With the exponential increase in attention to Big Data as of late, the focus on speed and scale has left out the things that are “sticky” or “difficult to quantify” (Wang, 2016b). This disparity reflects the traditional gap between qualitative and quantitative research. In fact, Wang found that referring to the qualitative efforts in traditional terms (e.g., ethnography) was met with enough skepticism and pushback that a new term friendly to data jargon had to emerge, and thus the term thick data was born.

Image courtesy of Tricia Wang

At first glance, thick data is not attractive in the traditional big data sense: it is inefficient, does not scale up, and is usually not reproducible. When combined with big data, however, it fills the gaps that quantitative measures leave open. Big data can identify patterns, but it cannot explain why those patterns exist; if big data goes broad, thick data goes deep. Thick data relies on human learning and complements the findings of machine learning for which big data cannot provide adequate context. It reveals the social context of observed patterns and can handle irreproducible complexity. It is the qualitative complement to quantitative data, the color and nuance in a black-and-white picture.

Forces against the adoption of thick data typically stem from bias against qualitative data. Again, it is messy: inefficient, sticky, complicated, and nuanced. Most of the big data world values what can be quantified and the relationships that can be mapped. As Wang (2016a) notes, quantifying is addictive, and it can be easy to throw out data that doesn’t fit a numerical value. It isn’t a zero-sum game, however; big data and thick data complement each other. But “silo culture”—the same phenomenon that disrupts data integration and wreaks havoc across enterprise data environments—threatens the symbiosis between the two (Riskope, 2017). While thick data is not an innovation in the same sense as cutting-edge artificial intelligence or new developments in IoT technology, it is an innovation in how we think about the world around us and what matters when studying that world.

References

Riskope. (2017). Big data or thick data: Two faces of a coin.  Retrieved from https://www.riskope.com/2017/05/24/big-data-or-thick-data-two-faces-of-a-coin/

Wang, T. (2016a). The human insights missing from big data.  Retrieved from https://www.ted.com/talks/tricia_wang_the_human_insights_missing_from_big_data

Wang, T. (2016b). Why big data needs thick data.  Retrieved from https://medium.com/ethnography-matters/why-big-data-needs-thick-data-b4b3e75e3d7

Five thousand days of the World Wide Web

In 2007, Kevin Kelly looked back on the last 5,000 days of the World Wide Web and asked: what’s to come? Now, with years of hindsight since that talk, we ask: what next?

One thing I have to call attention to here is the latter part of the talk, in which Kelly discusses codependency and the exchange of privacy for convenience. Total personalization equals total transparency. From a development and data perspective, nothing about that statement is outlandish. But as we have seen in the social fabric over the last few years, not everyone understands or agrees with that logic; there is a demand for personalization without the transparency. I believe the watershed moment in that space will be a split between those who eschew all personalization in order to maintain privacy, and those determined to innovate a way to have both personalization and privacy to the degree we expect now.

That is not my prediction for an innovation in the next two decades. For that, think back to 2012, when Google Glass was first introduced to the public. It was a product ahead of its time and failed to gain traction. Less than ten years later, Google is refining the product for a more sophisticated release, and targeted audiences are paying attention. Looking ahead to 2030 and beyond, augmented reality products will be as commonplace as the personal vital-signs wearable (Apple Watch) or the natural language processor in the living room (Amazon Alexa). The forces working in their favor are both tangible and intangible. Augmented reality is already here, most notably in current iPhone models, which has introduced the concept incrementally and in a friendly way on an existing device rather than as a bombshell new product class. Consumers can experience the tangible technology on devices they already know, gain confidence, and accept the new products that push the envelope. These are a mix of technological, cultural, and social forces.

These same forces can work against adoption. The development of augmented reality now centers around headsets and devices with cameras, but what of technologies that could project fully functional desktops and workstations into thin air, to be touched and manipulated as though they were physically there? The interface running Tony Stark’s lab in Iron Man is not run through Google Glass; it is simply there. Assuming this can be done, take my earlier point about transparency and privacy and apply it to technologies that, by definition, augment the very reality we function in. If people are uncomfortable now with the personalization/transparency tradeoff, a new device that alters how they see and interact with the world might simply be a bridge too far.

References

Diamandis, P. H. (2019). Augmented 2030: The apps, headsets, and lenses getting us there. Retrieved from https://singularityhub.com/2019/09/13/augmented-2030-the-apps-headsets-and-lenses-getting-us-there/

Kelly, K. (2007). The next 5,000 days of the web. Retrieved from https://www.ted.com/talks/kevin_kelly_the_next_5_000_days_of_the_web

When Collaborative Learning isn’t an Open-Office Plan for Kids

According to Adams Becker et al. (2017, p. 20), “the advent of educational technology is spurring more collaborative learning opportunities,” driving innovation in a symbiotic relationship that pushes development in both areas as a product of the other. Collaborative learning and the technological developments that help drive it are trends but not fads.

The confluence of collaborative learning and educational technology.

It may be easy to draw parallels between collaborative learning and open-plan offices. That corporate architecture fad of recent years does appear to be in the same spirit as collaborative learning, and the technological tools used in conjunction with open-plan offices—or in spite of them—do have a supporting relationship. But the similarities stop there. Open-plan offices had the best intentions of creating more face-to-face interaction among colleagues, but they have been shown to reduce such interaction by a drastic margin, pushing employees toward text-based communication under the social pressure to “look busy” amongst coworkers (James, 2019).

There is one carry-over from corporate collaboration that is fruitful in collaborative learning spaces: synchronous communication via messaging apps such as Slack. These interactions are purposeful and augment the authentic active learning students engage in with collaborative learning. Just as workers do not operate in silos within an organization, students are encouraged to engage with others in various collaborative methods. Educational research and practice reinforce these lessons learned from the corporate world, and are helpful forces driving innovation and advancement in collaborative learning practice and technology.

References

Adams Becker, S., Cummins, M., Davis, A., Freeman, A., Hall Giesinger, C., & Ananthanarayanan, V. (2017). NMC Horizon report: 2017 higher education edition. Austin, TX: The New Media Consortium.

James, G. (2019). Open-plan offices literally make you stupid, according to Harvard.  Retrieved from https://www.inc.com/geoffrey-james/open-plan-offices-literally-make-you-stupid-according-to-harvard.html

XML and Standardization

XML is a true double-edged sword in the data analytics world, with advantages and disadvantages not unlike those of relational databases or NoSQL, and the global advantages and disadvantages inherent in XML are just as applicable in the healthcare field. Consider, for example, the flexibility of user-created tags defined on the fly, something that is both an advantage (ease of use, compatibility, expandability, et cetera) and a disadvantage (lack of standardization, potential incompatibility with user interfaces, et cetera) in the global sphere; the same holds in healthcare settings. Consider an electronic health record (EHR): different providers and points of care may add to the EHR without having to conform to the standards of other providers; that is, data from a rheumatologist may be added to the patient record with the same ease as data from a general practitioner or psychologist. The portability of the XML format means that the record can be exchanged amongst providers or networks as long as the recipient can read XML. However, this versatility comes at a price: the lack of standardization means that all tags and fields in any given record must be known prior to querying, which can be quite time-consuming.
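
A hypothetical EHR fragment (all tag names invented for illustration) makes the tradeoff visible: each specialty can add its own elements freely, but anyone querying the record has to know, or first discover, which tags each provider chose to use.

```python
# Parsing a made-up multi-provider EHR fragment with Python's standard library.
import xml.etree.ElementTree as ET

record = """
<patient id="12345">
  <generalPractice><bloodPressure>120/80</bloodPressure></generalPractice>
  <rheumatology><rheumatoidFactor units="IU/mL">14</rheumatoidFactor></rheumatology>
  <psychology><phq9Score>6</phq9Score></psychology>
</patient>
"""

root = ET.fromstring(record)
for specialty in root:                  # providers added these without a shared schema
    for observation in specialty:
        print(specialty.tag, observation.tag, observation.attrib, observation.text)
```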

As an analogy from a different industry, consider a consumer packaged goods (CPG) manufacturer. The CPG keeps its own internal master data schemas in relational databases and reserves XML for its reseller data interface, so that the different wholesalers and retailers in its network can share sales data back to the CPG in a common format. While all participants use a handful of core attributes (e.g., manufacturer SKU and long description), each wholesaler and retailer also has its own set of proprietary attributes. XML allows the participants to feed data back to the CPG without conforming to a schema imposed across the entire retail network, and it allows the CPG to glean the requisite data shared amongst all participants. However, the process requires registering the known tags for each new participant, so that the CPG knows ahead of time which tags are relevant to each one.
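
A sketch of that reseller interface (tag names and values invented) shows the same pattern: the CPG extracts the shared core attributes from every feed and has to be told up front which proprietary tags each partner will send.

```python
# Hypothetical partner feeds sharing core attributes plus proprietary extras.
import xml.etree.ElementTree as ET

feeds = {
    "wholesaler_a": """<sale>
        <manufacturerSKU>ABC-123</manufacturerSKU>
        <longDescription>Widget, 12-pack</longDescription>
        <warehouseZone>NE-04</warehouseZone>
    </sale>""",
    "retailer_b": """<sale>
        <manufacturerSKU>ABC-123</manufacturerSKU>
        <longDescription>Widget, 12-pack</longDescription>
        <loyaltyTier>gold</loyaltyTier>
    </sale>""",
}

CORE_TAGS = {"manufacturerSKU", "longDescription"}

for partner, xml_text in feeds.items():
    sale = ET.fromstring(xml_text)
    core = {el.tag: el.text.strip() for el in sale if el.tag in CORE_TAGS}
    extra = sorted(el.tag for el in sale if el.tag not in CORE_TAGS)
    print(partner, core, "proprietary:", extra)
```

The design choice is the same one the healthcare example faces: schema-on-read flexibility for the partners, at the cost of per-partner configuration work on the CPG side.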

The Role of Data Brokers in Healthcare

In courses I’ve led before, we looked at the disjointed data privacy regulations in the United States and at current events in data privacy (e.g., Facebook, Cambridge Analytica, personal genomics testing). The overall issue recurs in any setting: giving a single entity a large amount of data inevitably raises questions of ethics, privacy, security, and motivation.

Where healthcare data brokers are concerned, the stated goals differ by type of data. For data that patients interact with directly, the goal is to give patients “more control over the data” (Klugman, 2018) and perhaps bypass the clunky patient portals set up by providers. For data that is not personally identifiable, the goals can be much less altruistic, such as feeding a multi-billion-dollar market (Patientory, 2018) or contributing to health insurance discrimination (Butler, 2018). I am not naïve enough to think that all exercises in healthcare should be altruistic, and the concept of insurance itself carries a certain modicum of discrimination at its core; however, weaponizing the data to aid in unfair practices is beyond the pale.

From a data engineering perspective, a broker in the truest sense of the word may act as a clearinghouse between providers with disparate systems, enabling the seamless transfer of patient data between those providers without putting the burden of ETL on either of them. Whereas XML formatting and other portability developments have allowed providers using different EHR systems to port patient data, a data brokerage would be an independent party acting on the patient’s behalf, handling the technical details of integrating that data across all providers and interested parties. Beyond holding the data, the broker would be responsible for ensuring that each provider and biller has access to the same single source of truth on that particular patient.
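
Purely as a conceptual sketch (the providers, fields, and records below are invented), the clearinghouse role amounts to folding per-provider extracts into one consolidated view per patient:

```python
# Toy broker: merge per-provider extracts into a single record per patient.
from collections import defaultdict

provider_extracts = [
    {"patient_id": "P-001", "source": "clinic_a", "allergies": ["penicillin"]},
    {"patient_id": "P-001", "source": "hospital_b", "last_visit": "2019-11-02"},
    {"patient_id": "P-002", "source": "clinic_a", "allergies": []},
]

single_source = defaultdict(dict)
for extract in provider_extracts:
    patient = extract["patient_id"]
    for field, value in extract.items():
        if field not in ("patient_id", "source"):
            # A real broker needs reconciliation rules for conflicting values;
            # here, later extracts simply overwrite earlier ones.
            single_source[patient][field] = value

print(dict(single_source))
```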

This would, of course, require a data warehouse of sorts to hold the single source, and it puts the questions of security, privacy, transparency, and ethics on the broker. The broker has to make money to survive, a business model must emerge, and so it would not be immune to market forces. The aggregation of so much patient data in one place would be too great a temptation to leave unmonetized as de-identified commodities, so a secondary market would emerge and lead to the same issues cited above. Call me pessimistic, but the best predictor of future actions is past behavior, and thus far the companies holding massive amounts of data about our lives either cannot keep it secure from breaches or are perfectly happy selling it while turning a blind eye to what is done with it.

References

Butler, M. (2018). Data brokers and health insurer partnerships could result in insurance discrimination. Retrieved from https://journal.ahima.org/2018/07/24/data-brokers-and-health-insurer-partnerships-could-result-in-insurance-discrimination/

Klugman, C. (2018). Hospitals selling patient records to data brokers: A violation of patient trust and autonomy. Retrieved from http://www.bioethics.net/2018/12/hospitals-selling-patient-records-to-data-brokers-a-violation-of-patient-trust-and-autonomy/

Patientory. (2018). Data brokers have access to your information, do you? Retrieved from https://medium.com/@patientory/data-brokers-have-access-to-your-health-information-do-you-562b0584e17e

Data-in-Motion or Data-at-Rest?

Reading the available material on data-in-motion reminds me of when I first read about data lakes over data warehouses, or NoSQL over SQL: the urgency of the former and the supposed danger of the latter are both overblown. Put simply, data-in-motion provides real-time insights. Most of our analytics efforts across the data science spheres apply to stored data, be it years, weeks, or hours old. Working with data-in-motion means not storing it prior to analyzing it, extracting insights as the data rolls in. This is one workstream of a dual effort that tells the whole picture: historical data provides insight into what happened and material to train on for future detection, while real-time data shows what is happening right now.
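
The distinction is easy to see in miniature. Below is a sketch in plain Python (readings invented; real deployments would reach for something like Spark or Kafka) of analyzing data in motion as a rolling window over a stream, rather than querying it after it lands in storage:

```python
# Rolling statistics over a stream: analyze readings as they arrive.
from collections import deque

def rolling_mean(stream, window=5):
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)
    for reading in stream:
        recent.append(reading)
        yield sum(recent) / len(recent)

sensor_stream = [70.1, 70.3, 70.2, 75.8, 80.4, 88.9]  # e.g., live temperature readings
for value in rolling_mean(sensor_stream):
    print(round(value, 2))
```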

Churchward (2018) argues a fair point here: once data is stored, it isn’t real-time by definition. But taking that argument to a logical extreme by asking whether we would like to make decisions on data three months old is a stretch. While it is true that matters such as security and intrusion detection must have real-time detection, categorically dismissing data-at-rest analytics is reckless. It vilifies practices that are the foundation of any comprehensive analytics strategy. Both data-at-rest and data-in-motion are valuable drivers of any business intelligence effort that seeks to paint a total picture of any phenomena. 

There are, of course, less frantic cases to be made for data-in-motion. Patel (2018) illustrates a critical situation on an oil drilling rig, in which information even a few minutes old can be life-threatening. In this case, written for Spotfire X, there may be some confusion between monitoring and analytics: the dashboard shown on the website and the written scenario paint a picture more of monitoring and dashboarding than of the sort of analytics we would consider deploying Spark or Kafka for. I don’t need a lot of processing power to tell me that a temperature sensor’s readings are increasing.

Performing real-time analytics on data-in-motion is an intensive task, requiring quite a bit of computing resources. Scalable solutions such as Spark and Kafka are available but may eventually hit a wall. Providers such as Logtrust (2017) differentiate themselves by pointing out the potential shortfalls of those solutions, offering a single platform for both data-in-motion and data-at-rest analytics.

References

Churchward, G. (2018). Why “true” real-time data matters: The risks of processing data-at-rest rather than data-in-motion. Retrieved from https://insidebigdata.com/2018/03/22/true-real-time-data-matters-risks-processing-data-rest-rather-data-motion/

Logtrust. (2017). Real-time IoT big data-in-motion analytics.

Patel, M. (2018). A new era of analytics: Connect and visually analyze data in motion. Retrieved from https://www.tibco.com/blog/2018/12/17/a-new-era-of-analytics-connect-and-visually-analyze-data-in-motion/