Monday, June 6, 2016

Keynote from Google Research on Building Knowledge Bases at #ICWE2016


I report here some highlights of the keynote speech by Xin Luna Dong at the 16th International Conference on Web Engineering (ICWE 2016). Incidentally, she is now moving to Amazon to start a new project on building an Amazon knowledge base.

Building knowledge bases still remains a challenging task.
First, one has to decide how to build the knowledge: automatically or manually?
A survey in 2014 reported the following list of large efforts in knowledge building: the top 4 approaches are manually curated, the bottom 3 are automatic.


Google's Knowledge Vault and Knowledge Graph are the big winners in terms of volume.

When you move to long tail content, curation does not scale. Automation must be viable and precise.
This is in line with the research we are starting on Extracting Changing Knowledge (we presented a short paper at a Web Science 2016 workshop last month). Here is a summary of our approach:

Where can knowledge be extracted from? In Knowledge Vault:
  • largest share of the content comes from DOM structured documents
  • then textual content
  • then annotated content
  • and a small share from web tables
Knowledge Vault is a matrix-based approach to knowledge building, with rows = entities and columns = attributes.
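To make the "rows = entities, columns = attributes" view a bit more tangible, here is a minimal Python sketch. It is only an illustration of the matrix idea, not how Knowledge Vault is implemented: the triples, the attribute names and the `build_matrix` helper are all invented for this example.

```python
from collections import defaultdict

# Hypothetical (entity, attribute, value) triples, invented for illustration.
triples = [
    ("Ada Lovelace", "place_of_birth", "London"),
    ("Ada Lovelace", "profession", "mathematician"),
    ("Alan Turing", "place_of_birth", "London"),
]

def build_matrix(triples):
    """Return a nested dict: matrix[entity][attribute] -> set of candidate values.

    Each entity is a row, each attribute a column; a cell may hold several
    candidate values because different sources can disagree.
    """
    matrix = defaultdict(lambda: defaultdict(set))
    for entity, attribute, value in triples:
        matrix[entity][attribute].add(value)
    return matrix

matrix = build_matrix(triples)
columns = sorted({attribute for row in matrix.values() for attribute in row})
for entity, row in matrix.items():
    # Empty cells are exactly the places where knowledge is still missing.
    print(entity, [sorted(row.get(attribute, set())) for attribute in columns])
```

A sparse nested dictionary is used here because most cells of such a matrix would be empty.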



It assumes the entities are already available (e.g. in Freebase) and builds its training on top of them.

One can build KBs by grouping triples into buckets with a similar probability of being correct. It is therefore important to estimate the correctness probability precisely.
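As a rough sketch of what bucketing triples by probability could look like, the Python snippet below groups scored triples into equal-width probability buckets and compares the mean predicted probability with the observed accuracy in each bucket (a simple calibration check). The function name, the bucket width and the sample data are my own assumptions, not something shown in the keynote.

```python
def calibration_buckets(scored_triples, n_buckets=5):
    """Group (predicted_probability, is_correct) pairs into equal-width buckets.

    Returns one tuple per non-empty bucket:
    (bucket_low, bucket_high, mean_predicted, observed_accuracy, count).
    """
    buckets = [[] for _ in range(n_buckets)]
    for prob, is_correct in scored_triples:
        # A probability of exactly 1.0 falls into the last bucket.
        idx = min(int(prob * n_buckets), n_buckets - 1)
        buckets[idx].append((prob, is_correct))

    report = []
    for i, items in enumerate(buckets):
        if not items:
            continue
        mean_predicted = sum(p for p, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        report.append((i / n_buckets, (i + 1) / n_buckets,
                       mean_predicted, accuracy, len(items)))
    return report

# Invented sample: predicted probability and whether the triple was verified as true.
scored = [(0.95, True), (0.9, True), (0.85, False),
          (0.3, False), (0.25, True), (0.1, False)]
report = calibration_buckets(scored)
for low, high, mean_predicted, accuracy, count in report:
    print(f"[{low:.1f}, {high:.1f}) predicted={mean_predicted:.2f} "
          f"observed={accuracy:.2f} n={count}")
```

If the estimates are well calibrated, the predicted and observed columns should be close in every bucket; large gaps indicate over- or under-confident extraction.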
Errors can include mistakes on:
  • triple identification
  • entity linkage
  • predicate linkage
  • source data 
Besides general purpose KBs, Google built lightweight vertical knowledge bases (more than 100 available now).

When extracting knowledge, the ingredients are: the data source, the extraction approach, the data items themselves, the facts, and their probability of truth.



Several models can be used for extracting knowledge. Two extremes of the spectrum are:
  1. Single-truth model. Every fact has only one true value. We trust the value reported by the largest number of data sources.
  2. Multi-layer model. It separates source quality from extractor quality, and data errors from extraction errors. On top of it, one can build a knowledge-based trust model that defines the trustworthiness of web pages, and compare this measure with the PageRank of those pages.
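To illustrate the difference between the two extremes, here is a small Python sketch. The first function is plain majority voting, which matches the single-truth description above. The second is a simple weighted vote using per-source trust scores; it is much simpler than the multi-layer model from the keynote and is only meant to show why accounting for source quality can change the outcome. The source names, weights and claims are all invented.

```python
from collections import Counter, defaultdict

def majority_vote(claims):
    """claims: list of (source, data_item, value). Return {data_item: most reported value}."""
    votes = defaultdict(Counter)
    for _source, item, value in claims:
        votes[item][value] += 1
    return {item: counter.most_common(1)[0][0] for item, counter in votes.items()}

def weighted_vote(claims, source_weight, default_weight=0.5):
    """Like majority_vote, but each source contributes its (assumed) trust score."""
    totals = defaultdict(lambda: defaultdict(float))
    for source, item, value in claims:
        totals[item][value] += source_weight.get(source, default_weight)
    return {item: max(scores, key=scores.get) for item, scores in totals.items()}

# Invented claims about one data item, from three hypothetical sources.
claims = [
    ("site_a", ("Everest", "height_m"), "8848"),
    ("site_b", ("Everest", "height_m"), "8840"),
    ("site_c", ("Everest", "height_m"), "8840"),
]
# Assumed trust scores: one reliable source, two unreliable ones.
trust = {"site_a": 0.9, "site_b": 0.2, "site_c": 0.3}

print(majority_vote(claims))         # the value backed by most sources
print(weighted_vote(claims, trust))  # the value backed by the most trusted source
```

In this toy case the two strategies disagree: two low-trust sources outvote one reliable source under majority voting, but not under the weighted vote.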

In general, the challenge is to move from individual information and data points, to integrated and connected knowledge. Building the right edges is really hard though.
Overall, a lot of ingredients influence the correctness of knowledge: temporal aspects, data source correctness, capability of extraction and validation, and so on.

In summary: plenty of research challenges to be addressed, both by the data science and modeling communities!


To keep updated on my activities you can subscribe to the RSS feed of my blog or follow my twitter account (@MarcoBrambi).

Tuesday, May 17, 2016

Modeling and data science for citizens: multicultural diversity and environmental monitoring at ICWSM

This year we decided to be present at ICWSM 2016 in Cologne, with two contributions that basically blend model driven software engineering and big data analysis, to provide value to users and citizens both in terms of high quality software and added value information provision.







We joined with two papers, respectively:
Model Driven Development of Social Media Environmental Monitoring Applications presented at the SWEEM (Workshop on the Social Web for Environmental and Ecological Monitoring) workshop.

Slides here:




and:

Studying Multicultural Diversity of Cities and Neighborhoods through Social Media Language Detection, presented at the CityLab workshop at ICWSM 2016. The focus of this work is to study cities as melting pots of people with different cultures, religions, and languages. Through multilingual analysis of Twitter content shared within a city, we analyze the prevalent language in the different neighborhoods of the city and we compare the results with census data, in order to highlight any parallels or discrepancies between the two data sources. We show that the officially identified neighborhoods actually represent significantly different communities and that the use of social media as a data source helps to detect weak signals that are not captured by traditional data. Slides here:



We are now continuously looking for new datasets and computational challenges. Feel free to ask or to propose ideas!
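For readers curious about what the neighborhood-versus-census comparison in the second paper might look like in code, here is a minimal Python sketch. It is not the pipeline used in the paper: it assumes the tweets have already been geolocated and tagged with a language code, it uses total variation distance as one possible discrepancy measure, and all neighborhood names, language codes and figures are invented.

```python
from collections import Counter, defaultdict

def language_shares(tweets):
    """tweets: iterable of (neighborhood, language_code).

    Returns {neighborhood: {language_code: share of tweets}}.
    """
    counts = defaultdict(Counter)
    for neighborhood, lang in tweets:
        counts[neighborhood][lang] += 1
    shares = {}
    for neighborhood, counter in counts.items():
        total = sum(counter.values())
        shares[neighborhood] = {lang: n / total for lang, n in counter.items()}
    return shares

def total_variation(p, q):
    """Total variation distance between two language distributions (0 = identical)."""
    langs = set(p) | set(q)
    return 0.5 * sum(abs(p.get(lang, 0.0) - q.get(lang, 0.0)) for lang in langs)

# Invented, already language-tagged tweets for two hypothetical neighborhoods.
tweets = [("A", "it"), ("A", "it"), ("A", "it"), ("A", "en"),
          ("B", "it"), ("B", "zh")]
# Invented census language shares for the same neighborhoods.
census = {"A": {"it": 0.75, "en": 0.25},
          "B": {"it": 0.9, "zh": 0.1}}

shares = language_shares(tweets)
discrepancy = {n: total_variation(shares[n], census[n]) for n in shares}
for neighborhood, distance in sorted(discrepancy.items()):
    print(neighborhood, round(distance, 2))
```

A neighborhood with a high distance is one where social media and census data tell different stories, which is where a closer look would be worthwhile.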


Wednesday, March 30, 2016

Ready to crowdsource your modeling language notation?


As model-driven engineering practitioners, we sometimes encounter weird modelling notations for the languages we use... and this is also definitely true for modelling language adopters!

We always end up wondering who could ever have come up with such a terrible syntax for a language, even for very well-established notations (including, for instance, some pieces of UML or BPMN). I take it for granted that this is a common experience (raise your hand if not).

This led to the idea that syntax definition, too, should be a more collaborative task. Therefore, we decided to give it a try and test whether crowdsourcing techniques can be used to create and validate language constructs, in particular their concrete syntax (i.e. notation).



As part of our research work in this area, together with Jordi Cabot's group, we have set up an experimental crowdsourcing campaign using our tool CrowdSearcher.

This boils down to a very simple case: we are asking anyone on the web to look into a very small subset of BPMN, and to participate in 3 simple tasks, including questions for selecting the best notation for some of the BPMN concepts (it won’t take more than 3 minutes).

Please help us by answering these 3 quick questions!
(and feel free to share the link with anyone else)

You can access the campaign at the following link:


Some disclaimers:
1. we don't care if you are the world's expert in BPMN or if you have never heard about it. We want you!
2. we ask you to register before taking the task (just click on the Register button once you enter the task), simply to make sure we only have one performance per person. All the analysis will run on anonymous data.
3. The results of the survey will be made publicly available in the following months. 




Wednesday, February 24, 2016

No, MDE is not Engineering!

Following up on my previous post on the actual "Engineering" contribution of Model Driven Engineering, here is the final result of the 2-day poll posted on twitter:



While this is definitely not a statistically significant benchmark, I think it's a significant insight into the field and into how we ourselves (MDE practitioners and researchers) see it.
Basically, there is absolutely no agreement and common understanding!!

On the question on whether MDE is a sound engineering discipline, one third of responders said yes, one third said no, and one third is not sure. Perfectly even distribution!

In summary, if you don't count uncertainty, here is what we collected:

No, MDE is NOT Engineering!


Does anyone want to comment on this?

You can also go through the discussion between Lionel Briand, Paola Inverardi and Manfred Broy on MDE maturity in my previous post about the panel at ModelsWard 2016.


Friday, February 19, 2016

How Mature is Model-driven Engineering as an Engineering Discipline? - Panel with Manfred Broy, Paola Inverardi and Lionel Briand

Within ModelsWard 2016, just after the opening speech I gave on February 19 in Rome, the opening panel was about the current maturity of model-driven engineering. I also hosted a poll on Twitter on this matter (results are available in this other post).

I'm happy the panelists raised several issues I pointed out myself in the introduction to the conference: as software modelling scientists, we are facing big challenges nowadays, as the focus of modelling is shifting now that software is more and more pervasive, in fields like IoT, social networks and social media, personal and wearable devices, and so on.

The panel included the keynote speakers of the conference: Manfred Broy, Paola Inverardi and Lionel Briand, three well-known names in the Software Engineering and Modeling community.



Manfred Broy highlighted:

  • there is a difference between scientific maturity and practical maturity. Sometimes, the latter in companies is far beyond the former.
  • a truck company in Germany has been practicing modelling for years, and now has this take on the world: whatever is not in the models doesn't exist
  • The current challenges are about how to model cyber-physical systems
  • The flow of models must be clarified: traceability, refinement, and model integration are crucial. You must guarantee syntactic and semantic coherence
  • You also need a coherent infrastructure of tools and artefacts that guarantees logical integration. You cannot obtain coherence of models without coherence of tools.
  • You need a lot of automation, otherwise you won't get practical maturity. This doesn't mean having end-to-end or complete round-trip model transformations, but you need to push automation as much as possible
Lionel Briand clarified that:

  • by definition, engineering rests on a deep mathematical background as its foundation and implies applying the scientific method to solving problems
  • maturity can be evaluated in terms of: how foundational the mathematical underpinning is, how many standards and tools exist and are used, and whether the scientific approach is used
  • Tools, methods, engineers, and scale of MDE are increasing (i.e. MDE is increasingly difficult to avoid)
Paola Inverardi recalled a position by Jean Bezivin:
  • we need to split Domain Engineering (where the problem is) and Support Engineering (where the solution will be)
  • MDE is the application of modelling principles and tools to any engineering field
  • So: is SOFTWARE actually the main field of interest of model-driven engineering?
  • In the modern interpretation of life, covering from smart cities to embedded, wearable, and cyber-physical systems, is the border between the environment and the system still relevant?
  • In the future we will need to rely less and less on the "creativity" of engineers when building models, and more and more on the scientific/ quantitative/ empirical methods for building models

The debate obviously revolved around these aspects, starting from Bran Selic, who asked a very simple question:
Isn't it the case that the real problem is about the word "modeling"? In any other field (architecture, mechanics, physics) modelling is implicit and obvious. Why not in our community? In the end, what we want to achieve is to raise abstraction and increase automation, nothing else.
Other issues have been raised too:

  • why is there so much difference in attitude towards modelling between Europe and the US?
  • what's the role of notations and standards in the success / failure of MDE? 
What's your take on this issue?
Feel free to share your thoughts here or on Twitter, mentioning me  (@MarcoBrambi).
AND:
Respond to my poll on twitter!



Friday, January 22, 2016

"What's special about us?" Harvard computational science symposium on Brain+Computer systems

On Friday, January 22, 2016 I attended a very interesting symposium organised by Harvard University Institute for Computational Science on "BRAIN + MACHINES: EXPLORING THE FRONTIERS OF NEUROSCIENCE AND COMPUTER SCIENCE".

Although it fell outside my main research fields, I found it very interesting and enlightening. And the discussed topics could also imply some crucial role for modelling practices.
The introductory speech by David Cox addressed the role and span of brain studies. First, he pointed out that when we say we want to study the brain, at a deep level, we say we want to study ourselves.
Indeed, we all perceive the human species as special. But why is that? We are not the biggest, longest-living, most numerous, or most adapted species. We simply cover a niche, like any other species.

What's special about us is complexity, not in a general sense (nature is full of complexity), but specifically the complexity of our brain.
Our brain includes 100 billion neurons and 100 trillion connections.
We are able to deal with complex information in incredible ways, because each neuron is actually a small computer, and globally our brain is enormously more powerful than any computer built so far.
We therefore build clusters of computers. But this is still not enough to match the power of the brain: we need to understand how the brain works in order to treat and replicate it.
Typical and crucial problems to study include vision and image processing, positioning and mobility, and so on.
That's where I think modelling can play a crucial role.
As we clearly pointed out in our book Model-Driven Software Engineering in Practice, modelling and abstraction are a natural way of working for our brain. And I got confirmation from renowned luminaries from Harvard today.
I really think that, should we discover the modelling approaches of our mind, we could disclose a lot of important aspects of several research fields.
Just imagine if:

  • we could represent human brain processes through models
  • we could replicate these processes and apply modelling techniques for improving, transforming and exploiting such models. 
This would pave the way to countless applications and research directions. However, one big challenge opens up for the modelling community: are we able to deal with models including trillions of items?
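To get a feeling for the scale of such a model, here is a back-of-the-envelope calculation in Python based on the figures quoted in the talk (100 billion neurons, 100 trillion connections). The storage cost of 16 bytes per connection (two 8-byte node identifiers, no attributes) is purely my assumption; any real model representation would likely need more.

```python
NEURONS = 100 * 10**9        # 100 billion, as quoted in the talk
CONNECTIONS = 100 * 10**12   # 100 trillion, as quoted in the talk

# Assumption: each connection is stored as a bare edge with two 8-byte node ids.
BYTES_PER_CONNECTION = 16

connections_per_neuron = CONNECTIONS // NEURONS
total_bytes = CONNECTIONS * BYTES_PER_CONNECTION
total_petabytes = total_bytes / 10**15  # decimal petabytes

print(f"average connections per neuron: {connections_per_neuron}")
print(f"raw edge list size: {total_petabytes} PB")
```

Even under this minimal assumption, the bare edge list alone is in the petabyte range, far beyond what typical modelling tools are designed to load into memory.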

Any further insights on this?

If you want further details on the event, check out the official website of the symposium here and my storified social media report here or in the preview below:





Monday, January 4, 2016

ECMFA: 12th European Conference on Modelling Foundations and Applications

This year I'm involved in the program committee of the Foundations track of ECMFA.
ECMFA 2016 is the 12th European Conference on Modelling Foundations and Applications and is co-located with STAF 2016, on 4-8 July, 2016, in Vienna, Austria. Here are some core excerpts from the call for papers, which could be of interest for software modelling practitioners.



The ECMFA conference series is dedicated to advancing the state of knowledge and fostering the industrial application of Model-Based Engineering (MBE, an approach to the design, analysis and development of software and systems based on high-level models and computer-based automation). Its focus is on engaging the key figures of research and industry in a dialog which will result in stronger and more effective practical application of MBE, hence producing more reliable software based on state-of-the-art research results.

The official conference web site is available at: http://ecmfa2016.itu.dk/

ECMFA 2016 will be co-located with ICMT, TAP, SEFM, ICGT and TTC as part of
the STAF federation of conferences, leading conferences on software
technologies (http://stafconferences.info). The joint organization of
these prominent conferences provides a unique opportunity to gather
practitioners and researchers interested in all aspects of software
technology, and allow them to interact with each other.

ECMFA has two distinct Paper Tracks: one for research papers (Track F)
dealing with the foundations for MBE, and one for industrial/applications
papers (Track A) dealing with the applications of MBE, including experience
reports on MBE tools.

Research Papers (Track F)
In this track, we are soliciting papers presenting original research on all
aspects of MBE. Typical topics of interest include, among others:
  • Foundations of (Meta)modelling
  • Domain Specific Modelling Languages and Language Workbenches
  • Model Reasoning, Testing and Validation
  • Model Transformation, Code Generation and Reverse Engineering
  • Model Execution and Simulation
  • Model Management aspects such as (Co-)Evolution, Consistency, Synchronization
  • Model-Based Engineering Environments and Tool Chains
  • Foundations of Requirements Modelling, Architecture Modelling, Platform Modelling
  • Foundations of Quality Aspects and Modelling non-functional System Properties
  • Scalability of MBE techniques
  • Collaborative Modeling
Industrial Papers (Track A)
In this track, we are soliciting papers representing views, innovations and
experiences of industrial players in applying or supporting MBE. In
particular, we are looking for papers that set requirements on the
foundations, methods, and tools for MBE. We are also seeking experience
reports or case studies on the application, successes or current
shortcomings of MBE. Quantitative results reflecting industrial experience
are particularly appreciated. All application areas of MBE are welcomed
including but not limited to any of the following:

  • MBE for Large and Complex Industrial Systems
  • MBE for Safety-Critical Systems
  • MBE for Cyber-Physical Systems
  • MBE for Software and Business Process Modelling
  • MBE Applications in Transportation, Health Care, Cloud & Mobile computing, etc. ...
  • Model-Based Integration and Simulation
  • Model-Based System Analysis
  • Application of Modeling Standards
  • Comparative Studies of MBE Methods and Tools
  • Metrics for MBE Development
  • MBE Training
Research papers should be up to 16 pages long; Industrial
papers should be 12 pages long (full papers), or 2 pages long (short
papers). Short papers will be given shorter presentation slots.
The authors of selected best papers from the foundations track will be
invited to submit extended version to a special issue of the SoSyM journal
(with another review process).

Important dates for authors:

Abstract submission deadline: February 15, 2016 AoE
Papers submission deadline: March 1, 2016 AoE
Notification to authors: April 7, 2016
Camera ready versions due: April 28, 2016

The complete call for papers is available here in text and here as pdf.




Wednesday, November 4, 2015

Automatic Code Generation for Cross-platform, Multi-Device Mobile Apps. An Industrial Experience

With Aldo Bongio (WebRatio), Jordi Cabot (ICREA and UOC), Hamza Ed-douibi (EMN) and Eric Umuhoza (Politecnico di Milano), we worked on research on Automatic Code Generation for Cross-platform, Multi-Device Mobile Apps.

We presented our study at the MobileDeLi workshop, where we reported on a comparative study conducted to identify the best trade-offs between different automatic code generation strategies.
Here are the slides presented there:

We covered the following strategies by implementing them using different technologies and target platforms:
  1. PIM-to-Native Code (NC)
  2. PIM-to-PSM-to-NC
  3. PSM-to-NC
  4. PIM-to-Cross Platform Code (CPC)
  5. PIM-to-Framework Specific Model (FSM)-to-CPC
Some additional details are available in this post by Eric on Jordi's blog.

Our study showed that no approach is better than the others in absolute terms, but it provided useful guidelines (e.g. cross-platform approaches are generally advisable for companies with limited resources) that helped us to identify the best strategy for the WebRatio company in particular.

Obviously, further investigations are ongoing...


Tuesday, October 27, 2015

Open position for Full Professor at Ecole des Mines de Nantes, AtlanMod group

I am passing along this announcement of an open position, which I received with a request to repost and disseminate it.

The AtlanMod research team (Inria, Mines Nantes, LINA) in Nantes (http://www.emn.fr/x-info/atlanmod) is hiring a full professor on an Inria chair to take the lead of the team and create a new Inria project-team in the future.
At AtlanMod they are looking for a high-profile researcher in the area of Modeling/MDE and its various applications, with experience in international research projects.
The working language of the team is English, so non-French speakers are also welcome.
You can find and download the complete position description from http://www.mines-nantes.fr/en/Media/Elements-RH/Full-professor-on-a-Mines-Nantes-Inria-chair-in-Software-Engineering-and-Model-Driven-Engineering

Deadline is very soon (by the end of this week) so please quickly contact Carole Menetrot (carole.menetrot@mines-nantes.fr) to get your application file!
In addition, feel free to contact anyone from the team for more information (or use our dedicated atlanmod-contact@mines-nantes.fr).



Monday, October 26, 2015

An Empirical Study on Simplification of Business Process Modeling Languages

Today I gave my presentation of our Empirical Study on Simplification of Business Process Modeling Languages at the Conference on Software Language Engineering (SLE 2015), in Pittsburgh, PA (co-located with SPLASH 2015).

You can find the full presentation here below, and some more details in this post by Eric Umuhoza on Jordi Cabot's blog.


  

The work is based on the observation that adapting modeling languages, especially by simplifying them, is a common practice. Most standard languages (like UML or BPMN) are overwhelmingly complex, and much of that complexity is not needed in typical usage scenarios; at the same time, companies don't want to go to the extreme of defining a brand new domain-specific language.

Unfortunately, there is a lack of examples of such simplification experiences that can be used as a reference for future projects. In this paper we report on a field study aimed at the simplification of a business process modeling language (namely, BPMN) for making it suitable to end users.

Our simplification process relies on a set of steps that encompass the selection of the language elements to simplify, generation of a set of language variants for them, measurement of effectiveness of the variants through user modeling sessions and extraction of quantitative and qualitative data for guiding the selection of the best language refinement, as shown here:


We describe the experimental setting, the output of the various steps of the analysis, and the results we obtained from users. Finally, we conclude with an outlook towards the generalization of the approach and consolidation of a language simplification method.
Beyond this, you can also find an overview of how these results have been used by Fluxedo, a startup built around a mobile app for social task planning.
