An Analysis of the Differences between Unit and Integration Tests

Trautsch, Fabian

by Fabian Trautsch

Dissertation

Date of oral examination: 2019-04-08

Published: 2019-04-12

Supervisor: Prof. Dr. Jens Grabowski

Referee: Prof. Dr. Jens Grabowski

Referee: Prof. Dr. Marcus Baum

Referee: Prof. Dr. Ina Schieferdecker

Link/cite as: http://dx.doi.org/10.53846/goediss-7393

Files

Name: ediss_ftrautsch.pdf

Size: 2.68Mb

Format: PDF


Abstract

English

Context: Software testing has established several concepts over the years, including unit and integration testing. These concepts are defined in standards and used in software testing certifications, which underlines their importance for research and industry. However, the concepts are decades old, and there is currently no evidence that they still apply to modern software systems.

Objective: This thesis evaluates whether the differences between unit and integration testing still hold today. To this end, we analyze the differences defined between these test levels and provide evidence on whether they remain valid for modern software.

Method: We performed quantitative and qualitative analyses of the differences between unit and integration tests. The quantitative analysis was a case study of 27 Java and Python projects with more than 49,000 tests. In this analysis, we classified tests as unit or integration tests according to the definitions of the Institute of Electrical and Electronics Engineers (IEEE) and the International Software Testing Qualifications Board (ISTQB) and calculated several metrics for those tests. We then used these metrics to assess three differences between the two test levels. For the qualitative analysis, we searched for relevant research literature, developer comments, and further information regarding differences between unit and integration tests. We evaluated the resources found to understand the research and industrial perspectives on the differences, i.e., whether they exist and to what extent.

Results: We found that most projects contain more integration tests than unit tests when tests are classified according to the IEEE and ISTQB definitions. However, the exact numbers differ between these definitions. Based on the developers' own classification of tests, there is no significant difference in the number of unit and integration tests. Our quantitative analysis shows that several of the defined differences no longer exist. We found that the defect types detected by the two test types do not differ and that there are no significant differences in their execution time. However, we confirmed that unit tests are better able to pinpoint the source of a defect. Our qualitative analysis of the research and industrial perspectives shows that both test types are executed automatically, that their test objectives mostly differ, and that practitioners have experienced integration tests as more costly than unit tests.

Conclusions: Our results suggest that the current definitions of unit and integration tests are outdated and need to be reconsidered, as most of the differences are vanishing. One reason could be technological advancements in software testing and software engineering. However, this needs further investigation.

Keywords: unit testing; integration testing; empirical software engineering; mining software repositories

German

Context: In the field of software testing, various concepts, such as unit and integration tests, have been established over the years. These concepts were defined in standards and are still used today in software testing certifications. This underlines their importance for industry and research. However, these concepts are already decades old. Currently, there is no evidence on whether they still apply to modern software systems.

Objective: The aim of this thesis is to evaluate whether the differences between unit and integration tests, as described in the standard literature, still hold. To this end, we analyze the differences between these two test types.

Method: We use qualitative and quantitative methods in this thesis. The quantitative analysis comprises a case study with 27 Java and Python projects, which together contain more than 49,000 tests. Within this analysis, we classify all tests as unit or integration tests using the definitions of the Institute of Electrical and Electronics Engineers (IEEE) and the International Software Testing Qualifications Board (ISTQB). In addition, we calculate several metrics for these tests to quantify the differences. For the qualitative analysis, we analyzed relevant literature, developer comments, and further information dealing with the differences between unit and integration tests.

Results: Our results show that current projects contain more integration tests than unit tests when we classify the tests according to the definitions of the IEEE and the ISTQB. The exact number depends on the definition. When we classify the tests the way their developers do, there are no more integration tests than unit tests. The quantitative analysis showed that most of the differences between the two test types named in the literature no longer apply to modern software. Our results show that unit and integration tests detect the same types of defects and that there are no differences in their execution time. However, we were able to confirm that unit tests are better suited to localizing defects. Our qualitative analysis showed that both test types are executed automatically, that their test objectives differ from each other, and that developers perceive integration tests as more expensive.

Conclusion: Our results show that many differences between unit and integration tests no longer exist. This suggests that the currently valid definitions of unit and integration tests do not apply to modern software systems. One reason for this could be the evolution of software development, which is driven by the improvement and development of software testing tools.
