
    ddc:000

    Explore " ddc:000" with insightful episodes like "Exploiting autobiographical memory for fallback authentication on smartphones", "Cross-species network and transcript transfer", "Erfassung und Behandlung von Positionsfehlern in standortbasierter Autorisierung", "Regularization methods for item response and paired comparison models" and "Information-theoretic graph mining" from podcasts like ""Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02", "Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02", "Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02", "Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02" and "Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02"" and more!

    Episodes (52)

    Exploiting autobiographical memory for fallback authentication on smartphones

    Smartphones have advanced from simple communication devices to multipurpose devices that capture almost every single moment in our daily lives and thus contain sensitive data like photos or contact information. In order to protect this data, users can choose from a variety of authentication schemes. However, what happens if one of these schemes fails, for example, when users are not able to provide the correct password within a limited number of attempts? So far, situations like this have been neglected by the usable security and privacy community that mainly focuses on primary authentication schemes. But fallback authentication is comparably important to enable users to regain access to their devices (and data) in case of lockouts. In theory, any scheme for primary authentication on smartphones could also be used as a fallback solution. In practice, fallback authentication happens less frequently and imposes different requirements and challenges on its design. The aim of this work is to understand and address these challenges. We investigate the occurrences of fallback authentication on smartphones in real life in order to grasp the characteristics that fallback authentication conveys. We also get deeper insights into the difficulties that users have to cope with during lockout situations. In combination with the knowledge from previous research, these insights are valuable to provide a detailed definition of fallback authentication that has been missing so far. The definition covers usability and security characteristics and depicts the differences to primary authentication. Furthermore, we explore the potential of autobiographical memory, a part of the human memory that relates to personal experiences of the past, for the design of alternative fallback schemes to overcome the well-known memorability issues of current solutions. We present the design and evaluation of two static approaches that are based on the memory of locations and special drawings. We also cover three dynamic approaches that relate to recent smartphone activities, icon arrangements and installed apps. This series of work allows us to analyze the suitability of different types of memories for fallback authentication. It also helps us to extend the definition of fallback authentication by identifying factors that influence the quality of fallback schemes. The main contributions of this thesis can be summarized as follows: First, it gives essential insights into the relevance, frequency and problems of fallback authentication on smartphones in real life. Second, it provides a clear definition of fallback authentication to classify authentication schemes based on usability and security properties. Third, it shows example implementations and evaluations of static and dynamic fallback schemes that are based on different autobiographical memories. Finally, it discusses the advantages and disadvantages of these memories and gives recommendations for their design, evaluation and analysis in the context of fallback authentication.
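
    To make the idea of a dynamic fallback scheme a bit more tangible, here is a small hypothetical sketch in Python: the device challenges the user with a mix of actually installed apps and distractors and accepts the answer if enough picks are correct. The app names, scoring rule and threshold are illustrative assumptions, not the schemes evaluated in the thesis.

```python
# Hypothetical sketch of a dynamic fallback challenge based on installed apps.
# App names, the scoring rule, and the threshold are illustrative assumptions.
import random

def build_challenge(installed_apps, distractor_apps, k=3):
    """Mix k installed apps with k distractors; the user must pick the installed ones."""
    options = random.sample(installed_apps, k) + random.sample(distractor_apps, k)
    random.shuffle(options)
    return options

def verify(selected, installed_apps, threshold=0.75):
    """Accept if the share of correct picks, penalized by wrong picks, is high enough."""
    if not selected:
        return False
    correct = sum(1 for app in selected if app in installed_apps)
    score = (2 * correct - len(selected)) / len(selected)   # +1 per hit, -1 per false pick
    return score >= threshold

installed = ["maps", "podcasts", "banking", "chess", "weather"]
distractors = ["fitness", "translator", "scanner", "radio", "notes"]
print(build_challenge(installed, distractors))
print(verify(["maps", "chess", "banking"], installed))      # True for an all-correct answer
```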

    Cross-species network and transcript transfer

    Metabolic processes, signal transduction, gene regulation, as well as gene and protein expression are largely controlled by biological networks. High-throughput experiments allow the measurement of a wide range of cellular states and interactions. However, networks are often not known in detail for specific biological systems and conditions. Gene and protein annotations are often transferred from model organisms to the species of interest. Therefore, the question arises whether biological networks can be transferred between species or whether they are specific to individual contexts. In this thesis, the following aspects are investigated: (i) the conservation and (ii) the cross-species transfer of eukaryotic protein-interaction and gene regulatory (transcription factor-target) networks, as well as (iii) the conservation of alternatively spliced variants. In the simplest case, interactions can be transferred between species, based solely on the sequence similarity of the orthologous genes. However, such a transfer often results either in the transfer of only a few interactions (medium/high sequence similarity threshold) or in the transfer of many speculative interactions (low sequence similarity threshold). Thus, advanced network transfer approaches also consider the annotations of orthologous genes involved in the interaction transfer, as well as features derived from the network structure, in order to enable a reliable interaction transfer, even between phylogenetically very distant species. In this work, such an approach for the transfer of protein interactions is presented (COIN). COIN uses a sophisticated machine-learning model in order to label transferred interactions as either correctly transferred (conserved) or as incorrectly transferred (not conserved). The comparison and the cross-species transfer of regulatory networks is more difficult than the transfer of protein interaction networks, as a huge fraction of the known regulations is only described in the (not machine-readable) scientific literature. In addition, compared to protein interactions, only a few conserved regulations are known, and regulatory elements appear to be strongly context-specific. In this work, the cross-species analysis of regulatory interaction networks is enabled with software tools and databases for global (ConReg) and thousands of context-specific (CroCo) regulatory interactions that are derived and integrated from the scientific literature, binding site predictions and experimental data. Genes and their protein products are the main players in biological networks. However, the fact that a gene can encode different proteins has so far largely been neglected. These alternative proteins can differ strongly from each other with respect to their molecular structure, function and their role in networks. The identification of conserved and species-specific splice variants and the integration of variants in network models will allow a more complete cross-species transfer and comparison of biological networks. With ISAR we support the cross-species transfer and comparison of alternative variants by introducing a gene-structure aware (i.e. exon-intron structure aware) multiple sequence alignment approach for variants from orthologous and paralogous genes. The methods presented here and the accompanying databases allow the cross-species transfer of biological networks, the comparison of thousands of context-specific networks, and the cross-species comparison of alternatively spliced variants.
Thus, they can be used as a starting point for the understanding of regulatory and signaling mechanisms in many biological systems.
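
    For readers unfamiliar with the basic transfer step, the sketch below shows the naive, sequence-similarity-only transfer of interactions via orthologs that the abstract contrasts with COIN's classifier-based approach; the orthology table, identity values and thresholds are invented for illustration.

```python
# Naive interologs-style transfer: map an interaction (a, b) from a source species
# to (a', b') in a target species if both genes have orthologs above a sequence-
# similarity threshold. All data below are made up for illustration.

orthologs = {  # source gene -> (target gene, % sequence identity)
    "geneA": ("tgtA", 78.0),
    "geneB": ("tgtB", 64.0),
    "geneC": ("tgtC", 31.0),
}
source_interactions = [("geneA", "geneB"), ("geneB", "geneC")]

def transfer(interactions, orthologs, min_identity=50.0):
    transferred = []
    for a, b in interactions:
        if a in orthologs and b in orthologs:
            (ta, ida), (tb, idb) = orthologs[a], orthologs[b]
            if ida >= min_identity and idb >= min_identity:
                transferred.append((ta, tb))
    return transferred

# A high threshold transfers few interactions; a low one transfers many speculative
# ones: the trade-off that motivates classifier-based approaches such as COIN.
print(transfer(source_interactions, orthologs, min_identity=50.0))
print(transfer(source_interactions, orthologs, min_identity=25.0))
```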

    Erfassung und Behandlung von Positionsfehlern in standortbasierter Autorisierung

    The ever-growing technical capabilities of mobile devices make it feasible to use them for mobile work or for controlling industrial manufacturing processes. For reasons of information security and operational safety, as well as to implement functional requirements, it is often necessary to restrict the availability of the corresponding access rights to users within authorized zones. For example, reading out critical data can be restricted to individual offices, or the mobile control of machines to suitable locations within a factory hall. To this end, the position of the user has to be determined. In real deployments, however, position estimates can exhibit errors on the order of the size of the authorized zones. So far, no solutions exist that take these errors into account in authorization decisions in order to minimize the damage resulting from wrong decisions. Furthermore, there are currently no methods for analyzing the quality properties of such location constraints before their deployment and for deciding whether a given positioning system is suitable given the magnitude of its position errors. This thesis therefore presents solutions for capturing and handling such position errors in the context of location-based authorization. First, an estimation method for position errors in pattern-based positioning systems is introduced that derives a distribution for the user's location from the characteristics of the performed measurements. To efficiently determine from this distribution the probability that the user is inside an authorized zone, an algorithm is presented that, based on precomputations, achieves a considerable runtime improvement over the direct computation. For the first time, a comprehensive comparison of existing location-based authorization strategies on the basis of decision theory is presented. With the risk-based authorization strategy, a new methodology is introduced that is optimal from a decision-theoretic point of view. Approaches are presented for extending classical access control models with location constraints that, when enforced, take into account the possibility of position errors and the consequences of wrong decisions. For the specification of authorized zones, property models are introduced that, in contrast to conventional polygons, model for each location the probability of observing a required property there. Methods are also presented to reduce the influence of measurement outliers on authorization decisions. Furthermore, analysis techniques are introduced that allow a qualitative and quantitative assessment of the suitability of positioning systems for a given scenario. The quantitative assessment is based on the newly developed concept of authorization models, which specify for each location the probability of obtaining a position estimate there that leads to authorization. The qualitative assessment provides, for the first time, a binary criterion for making a concrete statement about the suitability of a positioning system for a given scenario. The applicability of this analysis approach is illustrated in a case study, which demonstrates the necessity of such an analysis before location-based authorization is deployed. It is shown that, for typical positioning systems, the developed risk-based methods enable a considerable reduction of the damage caused by wrong decisions and thus improve the applicability of location-based authorization.
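
    A minimal sketch of the decision-theoretic idea behind a risk-based authorization strategy, under assumed costs: access is granted when the expected damage of granting (weighted by the probability that the user is outside the authorized zone) is lower than the expected damage of denying. The probability and cost values are illustrative and not taken from the thesis.

```python
# Risk-based authorization sketch: pick the action with the lower expected damage.
# p_inside would come from integrating the position-error distribution over the
# authorized zone; the cost values below are illustrative assumptions.

def risk_based_decision(p_inside, cost_false_grant, cost_false_deny):
    """Return True (grant) if granting has lower expected damage than denying."""
    expected_damage_grant = (1.0 - p_inside) * cost_false_grant
    expected_damage_deny = p_inside * cost_false_deny
    return expected_damage_grant <= expected_damage_deny

# Reading critical data: a wrong grant is far more costly than a wrong denial,
# so the user must be inside the zone with high probability.
print(risk_based_decision(p_inside=0.85, cost_false_grant=100.0, cost_false_deny=5.0))  # False
print(risk_based_decision(p_inside=0.99, cost_false_grant=100.0, cost_false_deny=5.0))  # True
```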

    Information-theoretic graph mining

    Real world data from various application domains can be modeled as a graph, e.g. social networks and biomedical networks like protein interaction networks or co-activation networks of brain regions. A graph is a powerful concept to model arbitrary (structural) relationships among objects. In recent years, the prevalence of social networks has made graph mining an important center of attention in the data mining field. There are many important tasks in graph mining, such as graph clustering, outlier detection, and link prediction. Many algorithms have been proposed in the literature to solve these tasks. However, normally these issues are solved separately, although they are closely related. Detecting and exploiting the relationship among them is a new challenge in graph mining. Moreover, with data explosion, more information has already been integrated into the graph structure. For example, bipartite graphs contain two types of nodes and graphs with node attributes offer additional non-structural information. Therefore, more challenges arise from the increasing graph complexity. This thesis aims to solve these challenges in order to gain new knowledge from graph data. An important paradigm of data mining used in this thesis is the principle of Minimum Description Length (MDL). It follows the assumption that the more knowledge we have learned from the data, the better we are able to compress the data. The MDL principle balances the complexity of the selected model and the goodness of fit between model and data. Thus, it naturally avoids over-fitting. This thesis proposes several algorithms based on the MDL principle to acquire knowledge from various types of graphs: Info-spot (Automatically Spotting Information-rich Nodes in Graphs) proposes a parameter-free and efficient algorithm for the fully automatic detection of interesting nodes, a novel outlier notion in graphs. Then, in contrast to traditional graph mining approaches that focus on discovering dense subgraphs, a novel graph mining technique CXprime (Compression-based eXploiting Primitives) is proposed. It models the transitivity and the hubness of a graph using structure primitives (all possible three-node substructures). Under the coding scheme of CXprime, clusters with structural information can be discovered, dominating substructures of a graph can be distinguished, and a new link prediction score based on substructures is proposed. The next algorithm SCMiner (Summarization-Compression Miner) integrates tasks such as graph summarization, graph clustering, link prediction, and the discovery of the hidden structure of a bipartite graph on the basis of data compression. Finally, a method for non-redundant graph clustering called IROC (Information-theoretic non-Redundant Overlapping Clustering) is proposed to smartly combine structural information with non-structural information based on MDL. IROC is able to detect overlapping communities within subspaces of the attributes. To sum up, algorithms to unify different learning tasks for various types of graphs are proposed. Additionally, these algorithms are based on the MDL principle, which facilitates the unification of different graph learning tasks, the integration of different graph types, and the automatic selection of input parameters that are otherwise difficult to estimate.
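
    To make the MDL intuition concrete, here is a small, generic two-part code-length calculation (not one of the thesis's coding schemes such as CXprime or SCMiner): a candidate node partition is scored by the bits needed to describe the partition plus the bits needed to encode the adjacency entries given per-block edge densities, and the partition with the smaller total wins.

```python
# Generic two-part MDL score for a node partition of an undirected graph:
# L(model) = bits for the cluster assignments; L(data | model) = entropy coding of
# the adjacency entries within each cluster block. Purely illustrative.
import math
from itertools import combinations

def bits(p):
    """Binary entropy in bits; 0 for degenerate probabilities."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def description_length(nodes, edges, partition):
    k = len(set(partition.values()))
    model_bits = len(nodes) * math.log2(max(k, 2))           # which node belongs to which cluster
    blocks = {}                                               # (cluster_i, cluster_j) -> (present, possible)
    for u, v in combinations(nodes, 2):
        key = tuple(sorted((partition[u], partition[v])))
        present, possible = blocks.get(key, (0, 0))
        blocks[key] = (present + ((u, v) in edges or (v, u) in edges), possible + 1)
    data_bits = sum(possible * bits(present / possible) for present, possible in blocks.values())
    return model_bits + data_bits

nodes = ["a", "b", "c", "d", "e", "f"]
edges = {("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")}
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {n: 0 for n in nodes}
print(description_length(nodes, edges, two_clusters))  # smaller total: the structure is captured
print(description_length(nodes, edges, one_cluster))
```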

    Datenerfassung und Privatsphäre in partizipativen Sensornetzen

    Partizipative Sensornetze (PSNs) stellen eine neue Art von Sensornetzen dar, die auf Basis von freiwillig zur Verfügung gestellten Mobiltelefonen etabliert werden. Sie ermöglichen eine großflächige Erfassung von Messdaten im direkten Umfeld von Menschen und können für zahlreiche Anwendungsszenarien verwendet werden. Neben ihren Vorzügen bringen PSNs aber auch Schwierigkeiten mit sich. Zwei zentrale Herausforderungen sind die ressourcenschonende Datenerfassung und der Schutz der Privatsphäre – beide resultieren aus der Instrumentalisierung privater Mobiltelefone zur Datenerfassung. Da der primäre Verwendungszweck der Geräte nicht die Aufzeichnung von Messdaten ist, darf diese deren Ressourcen nicht merklich belasten. Außerdem muss sichergestellt werden, dass durch die Erfassung von Messdaten die Privatsphäre der teilnehmenden Nutzer nicht verletzt wird. Der erste Teil der Arbeit beschäftigt sich mit dem Aspekt der ressourcenschonenden Datenerfassung. Zunächst werden PSNs betrachtet, bei denen punktuell Messungen durchgeführt werden. Bei diesen Netzen müssen die teilnehmenden Geräte über die durchzuführenden Messungen unterrichtet werden. Damit hierbei die Ressourcen der Endgeräte nicht unnötig belastet werden, wird ein Konzept vorgestellt, das einerseits eine robuste Verteilung der Messaufgaben sicherstellt, gleichzeitig jedoch versucht, die Energieressourcen der Mobiltelefone zu schonen. Bei PSNs mit großflächiger und kontinuierlicher Datenerfassung spielt die Verteilung der Messaufgaben keine so entscheidende Rolle. Hier muss vielmehr sichergestellt werden, dass die Energie- und die Übertragungskosten auf Seiten der Nutzer möglichst gering bleiben. Aus diesem Grund wird ein Ansatz zur lokalen Gruppierung von Messknoten beschrieben, der durch eine Aufteilung der anfallenden Aufgaben und eine intelligente Auswahl der Knoten zu einer ressourcenschonenden und weniger redundanten Datenerfassung führt. Der zweite Teil der Arbeit befasst sich mit dem Schutz der Privatsphäre der Teilnehmer und beinhaltet zwei Themenblöcke. Zum einen wird ein Ansatz zur automatisierten Erzeugung von Privatsphäre-Zonen vorgestellt, der ohne das Eingreifen der Nutzer die Zonen an das jeweilige Umfeld anpasst. Diese Zonen werden um die vom Nutzer häufig besuchten Orte erstellt und verhindern so mögliche, auf der Identifikation dieser Orte basierende Deanonymisierungsangriffe. Zum anderen wird ein Kalibrierungssystem für PSNs beschrieben, dessen Fokus sowohl auf der Verbesserung der Datenqualität als auch auf der Wahrung der Privatsphäre der Nutzer liegt. Hierfür ermöglicht dieses eine rückwirkende Anpassung bereits übertragener Daten, verhindert aber gleichzeitig durch eine Modifikation der Kalibrierungsparameter und der Upload-Zeitpunkte eine direkte Zuordnung zu einem Nutzer.
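
    The sketch below illustrates the general idea of privacy zones around frequently visited places (not the thesis's adaptive, environment-aware algorithm): location reports that fall within a fixed radius of a user's most-visited places are suppressed before upload. The grid-based place detection, radius and coordinates are simplified assumptions.

```python
# Simplified privacy-zone filter: drop location reports that lie close to the
# user's most frequently visited places. The fixed radius and the grid-based
# place detection are illustrative; the thesis adapts zones automatically.
import math
from collections import Counter

def frequent_places(reports, top_n=2, cell=0.001):
    """Approximate frequently visited places by the most common coordinate cells."""
    cells = Counter((round(lat / cell) * cell, round(lon / cell) * cell) for lat, lon in reports)
    return [place for place, _ in cells.most_common(top_n)]

def within(a, b, radius_m=200.0):
    # crude planar approximation, adequate for small distances
    dx = (a[0] - b[0]) * 111_000.0
    dy = (a[1] - b[1]) * 111_000.0 * math.cos(math.radians(a[0]))
    return math.hypot(dx, dy) <= radius_m

def filter_reports(reports, places, radius_m=200.0):
    return [r for r in reports if not any(within(r, p, radius_m) for p in places)]

history = [(48.1500, 11.5800)] * 20 + [(48.1371, 11.5754)] * 15 + [(48.2000, 11.6000)]
zones = frequent_places(history)
print(len(filter_reports(history, zones)))  # only reports outside the privacy zones remain
```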

    Estimation and model selection for dynamic biomedical images

    Compartment models are a frequently used tool for analyzing imaging data gained with medical and biological imaging techniques. The solutions of the differential equations derived from a compartment model provide nonlinear parametric functions, based on which the behavior of a concentration of interest over time can be described. Often, the number of compartments in a compartment model is unknown. As the model complexity itself, that is, the number of compartments, is certainly important information, it is desirable to estimate it from the observed data. Additionally, the unknown parameters have to be estimated. Therefore, methods dealing with both the parameter estimation and model selection in compartment models are needed. The methods proposed in this thesis are motivated by two applications from the field of medical and biological imaging. In the first application, the quantitative analysis of fluorescence recovery after photobleaching (FRAP) data, compartment models are used in order to gain insight into the binding behavior of molecules in living cells. As a first approach, we developed a Bayesian nonlinear mixed-effects model for the analysis of a series of FRAP images. Mixed-effect priors are defined on the parameters of the nonlinear model, which is a novel approach. With the proposed model, we get parameter estimates and additionally gain information about the variability between nuclei, which has not been studied so far. The proposed method was evaluated on half-nucleus FRAP data, also in comparison with different kinds of fixed-effects models. As a second approach, a pixelwise analysis of FRAP data is proposed, where information from the neighboring pixels is included into the nonlinear model for each pixel. This is innovative as the existing models are suitable for the analysis of FRAP data for some regions of interest only. For the second application, the quantitative analysis of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) of the breast, we use a compartment model which describes the exchange of blood between different, well-mixed compartments. In the analysis of such data, the number of compartments allows conclusions about the heterogeneity of cancerous tissue. Therefore, an estimation and model selection approach based on boosting, with which the number of compartments and the unknown parameters can be estimated at the voxel level, is proposed. In contrast to boosting for additive regression, where smoothing approaches are used, boosting in nonlinear parametric regression as described in this thesis is a novel approach. In an extension of this approach, the spatial structure of an image is taken into account by penalizing the differences in the parameter estimates of neighboring voxels. The evaluation of the method was done in simulation studies, as well as in the application to data from a breast cancer study. The majority of the program code used in the three approaches was newly developed in the programming languages R and C. Based on that code, two R packages were built.
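
    As a generic illustration of the joint estimation and model-selection problem (not the thesis's Bayesian mixed-effects or boosting approaches), the following sketch fits one- and two-component recovery curves to a simulated FRAP-like time series with scipy and compares them by BIC; the model forms and all constants are made up.

```python
# Fit one- vs. two-component recovery curves and select a model by BIC.
# Simulated data and model forms are illustrative only.
import numpy as np
from scipy.optimize import curve_fit

def one_comp(t, a, k):
    return a * (1 - np.exp(-k * t))

def two_comp(t, a1, k1, a2, k2):
    return a1 * (1 - np.exp(-k1 * t)) + a2 * (1 - np.exp(-k2 * t))

rng = np.random.default_rng(0)
t = np.linspace(0, 60, 120)
y = two_comp(t, 0.5, 0.8, 0.3, 0.05) + rng.normal(0, 0.02, t.size)  # "observed" recovery

def bic(model, p0):
    params, _ = curve_fit(model, t, y, p0=p0, maxfev=10000)
    rss = np.sum((y - model(t, *params)) ** 2)
    n, k = t.size, len(params)
    return n * np.log(rss / n) + k * np.log(n)

print("1 compartment BIC:", bic(one_comp, [0.8, 0.1]))
print("2 compartments BIC:", bic(two_comp, [0.5, 0.5, 0.3, 0.05]))  # lower BIC is preferred
```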

    General methods for fine-grained morphological and syntactic disambiguation

    We present methods for improved handling of morphologically rich languages (MRLs), where we define MRLs as languages that are morphologically more complex than English. Standard algorithms for language modeling, tagging and parsing have problems with the productive nature of such languages. Consider for example the possible forms of a typical English verb like work that generally has four different forms: work, works, working and worked. Its Spanish counterpart trabajar has 6 different forms in present tense: trabajo, trabajas, trabaja, trabajamos, trabajáis and trabajan and more than 50 different forms when including the different tenses, moods (indicative, subjunctive and imperative) and participles. Such a high number of forms leads to sparsity issues: In a recent Wikipedia dump of more than 400 million tokens we find that 20 of these forms occur only twice or less and that 10 forms do not occur at all. This means that even if we only need unlabeled data to estimate a model and even when looking at a relatively common and frequent verb, we do not have enough data to make reasonable estimates for some of its forms. However, if we decompose an unseen form such as trabajaréis 'you will work', we find that it is trabajar in future tense and second person plural. This allows us to make the predictions that are needed to decide on the grammaticality (language modeling) or syntax (tagging and parsing) of a sentence. In the first part of this thesis, we develop a morphological language model. A language model estimates the grammaticality and coherence of a sentence. Most language models used today are word-based n-gram models, which means that they estimate the transitional probability of a word following a history, the sequence of the (n - 1) preceding words. The probabilities are estimated from the frequencies of the history and the history followed by the target word in a huge text corpus. If either of the sequences is unseen, the length of the history has to be reduced. This leads to a less accurate estimate as less context is taken into account. Our morphological language model estimates an additional probability from the morphological classes of the words. These classes are built automatically by extracting morphological features from the word forms. To this end, we use unsupervised segmentation algorithms to find the suffixes of word forms. Such an algorithm might for example segment trabajaréis into trabaja and réis and we can then estimate the properties of trabajaréis from other word forms with the same or similar morphological properties. The data-driven nature of the segmentation algorithms allows them to not only find inflectional suffixes (such as -réis), but also more derivational phenomena such as the head nouns of compounds or even endings such as -tec, which identify technology oriented companies such as Vortec, Memotec and Portec and would not be regarded as a morphological suffix by traditional linguistics. Additionally, we extract shape features such as whether a form contains digits or capital characters. This is important because many rare or unseen forms are proper names or numbers and often do not have meaningful suffixes. Our class-based morphological model is then interpolated with a word-based model to combine the generalization capabilities of the former with the high accuracy of the latter when sufficient data is available.
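
    A toy sketch of the interpolation step just described, assuming we already have a word-based n-gram probability and a class-based probability derived from induced morphological classes; all probability values, class labels and the interpolation weight are placeholders, not estimates from the thesis.

```python
# Linear interpolation of a word-based n-gram model with a class-based
# morphological model: P(w | h) = lam * P_word(w | h) + (1 - lam) * P_class(w | h),
# where P_class(w | h) = P(c(w) | c(h)) * P(w | c(w)). Values are placeholders.

def class_prob(word, history, word_class, p_class_trans, p_word_given_class):
    c_w = word_class[word]
    c_h = tuple(word_class[h] for h in history)
    return p_class_trans.get((c_h, c_w), 1e-6) * p_word_given_class.get((word, c_w), 1e-6)

def interpolated_prob(word, history, p_word, class_args, lam=0.7):
    p_w = p_word.get((tuple(history), word), 1e-6)   # falls back to a tiny constant if unseen
    return lam * p_w + (1 - lam) * class_prob(word, history, *class_args)

word_class = {"trabajaréis": "V+fut+2pl", "vosotros": "PRON"}
p_word = {}                                             # the word n-gram was never observed
p_class_trans = {(("PRON",), "V+fut+2pl"): 0.2}         # class bigram probability
p_word_given_class = {("trabajaréis", "V+fut+2pl"): 0.05}
print(interpolated_prob("trabajaréis", ["vosotros"], p_word,
                        (word_class, p_class_trans, p_word_given_class)))
```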
    We evaluate our model across 21 European languages and find improvements between 3% and 11% in perplexity, a standard language modeling evaluation measure. Improvements are highest for languages with more productive and complex morphology such as Finnish and Estonian, but also visible for languages with a relatively simple morphology such as English and Dutch. We conclude that a morphological component yields consistent improvements for all the tested languages and argue that it should be part of every language model. Dependency trees represent the syntactic structure of a sentence by attaching each word to its syntactic head, the word it is directly modifying. Dependency parsing is usually tackled using heavily lexicalized (word-based) models, and a thorough morphological preprocessing is important for optimal performance, especially for MRLs. We investigate whether the lack of morphological features can be compensated for by features induced using hidden Markov models with latent annotations (HMM-LAs) and find this to be the case for German. HMM-LAs were proposed as a method to increase part-of-speech tagging accuracy. The model splits the observed part-of-speech tags (such as verb and noun) into subtags. An expectation maximization algorithm is then used to fit the subtags to different roles. A verb tag for example might be split into an auxiliary verb and a full verb subtag. Such a split is usually beneficial because these two verb classes have different contexts. That is, a full verb might follow an auxiliary verb, but usually not another full verb. For German and English, we find that our model leads to consistent improvements over a parser not using subtag features. Looking at the labeled attachment score (LAS), the percentage of words correctly attached to their head, we observe an improvement from 90.34 to 90.75 for English and from 87.92 to 88.24 for German. For German, we additionally find that our model achieves almost the same performance (88.24) as a model using tags annotated by a supervised morphological tagger (LAS of 88.35). We also find that the German latent tags correlate with morphology. Articles for example are split by their grammatical case. We also investigate the part-of-speech tagging accuracies of models using the traditional treebank tagset and models using induced tagsets of the same size and find that the latter outperform the former, but are in turn outperformed by a discriminative tagger. Furthermore, we present a method for fast and accurate morphological tagging. While part-of-speech tagging annotates tokens in context with their respective word categories, morphological tagging produces a complete annotation containing all the relevant inflectional features such as case, gender and tense. A complete reading is represented as a single tag. As a reading might consist of several morphological features, the resulting tagset usually contains hundreds or even thousands of tags. This is an issue for many decoding algorithms such as Viterbi, which have runtimes depending quadratically on the number of tags. In the case of morphological tagging, the problem can be avoided by using a morphological analyzer. A morphological analyzer is a manually created finite-state transducer that produces the possible morphological readings of a word form. This analyzer can be used to prune the tagging lattice and to allow for the application of standard sequence labeling algorithms.
    The downside of this approach is that such an analyzer is not available for every language or might not have the coverage required for the task. Additionally, the output tags of some analyzers are not compatible with the annotations of the treebanks, which might require a manual mapping of the different annotations or even a reduction of the complexity of the annotation. To avoid this problem, we propose to use the posterior probabilities of a conditional random field (CRF) lattice to prune the space of possible taggings. At the zero-order level the posterior probabilities of a token can be calculated independently from the other tokens of a sentence. The necessary computations can thus be performed in linear time. The features available to the model at this time are similar to the features used by a morphological analyzer (essentially the word form and features based on it), but also include the immediate lexical context. As the ambiguity of word types varies substantially, we just fix the average number of readings after pruning by dynamically estimating a probability threshold. Once we obtain the pruned lattice, we can add tag transitions and convert it into a first-order lattice. The quadratic forward-backward computations are now executed on the remaining plausible readings and are thus efficient. We can now continue pruning and extending the lattice order at a relatively low additional runtime cost (depending on the pruning thresholds). The training of the model can be implemented efficiently by applying stochastic gradient descent (SGD). The CRF gradient can be calculated from a lattice of any order as long as the correct reading is still in the lattice. During training, we thus run the lattice pruning until we either reach the maximal order or until the correct reading is pruned. If the reading is pruned, we perform the gradient update with the highest order lattice still containing the reading. This approach is similar to early updating in the structured perceptron literature and forces the model to learn how to keep the correct readings in the lower order lattices. In practice, we observe a high number of lower order updates during the first training epoch and almost exclusively higher order updates during later epochs. We evaluate our CRF tagger on six languages with different morphological properties. We find that for languages with a high word form ambiguity such as German, the pruning results in a moderate drop in tagging accuracy, while for languages with less ambiguity such as Spanish and Hungarian the loss due to pruning is negligible. However, our pruning strategy allows us to train higher order models (order > 1), which give substantial improvements for all languages and also outperform unpruned first-order models. That is, the model might lose some of the correct readings during pruning, but is also able to solve more of the harder cases that require more context. We also find our model to substantially and significantly outperform a number of frequently used taggers such as Morfette and SVMTool. Based on our morphological tagger we develop a simple method to increase the performance of a state-of-the-art constituency parser. A constituency tree describes the syntactic properties of a sentence by assigning spans of text to a hierarchical bracket structure. Petrov et al. developed a language-independent approach for the automatic annotation of accurate and compact grammars.
    Their implementation -- known as the Berkeley parser -- gives state-of-the-art results for many languages such as English and German. For some MRLs such as Basque and Korean, however, the parser gives unsatisfactory results because of its simple unknown word model. This model maps unknown words to a small number of signatures (similar to our morphological classes). These signatures do not seem expressive enough for many of the subtle distinctions made during parsing. We propose to replace rare words by the morphological reading generated by our tagger instead. The motivation is twofold. First, our tagger has access to a number of lexical and sublexical features not available during parsing. Second, we expect the morphological readings to contain most of the information required to make the correct parsing decision even though we know that things such as the correct attachment of prepositional phrases might require some notion of lexical semantics. In experiments on the SPMRL 2013 dataset of nine MRLs we find our method to give improvements for all languages except French, for which we observe a minor drop in the Parseval score of 0.06. For Hebrew, Hungarian and Basque we find substantial absolute improvements of 5.65, 11.87 and 15.16, respectively. We also performed an extensive evaluation on the utility of word representations for morphological tagging. Our goal was to reduce the drop in performance that is caused when a model trained on a specific domain is applied to some other domain. This problem is usually addressed by domain adaptation (DA). DA adapts a model towards a specific domain using a small amount of labeled or a huge amount of unlabeled data from that domain. However, this procedure requires us to train a model for every target domain. Instead, we try to build a robust system that is trained on domain-specific labeled and domain-independent or general unlabeled data. We believe word representations to be key in the development of such models because they allow us to leverage unlabeled data efficiently. We compare data-driven representations to manually created morphological analyzers. We understand data-driven representations as models that cluster word forms or map them to a vectorial representation. Examples heavily used in the literature include Brown clusters, Singular Value Decompositions of count vectors and neural-network-based embeddings. We create a test suite of six languages consisting of in-domain and out-of-domain test sets. To this end, we converted annotations for Spanish and Czech and annotated the German part of the Smultron treebank with a morphological layer. In our experiments on these data sets we find Brown clusters to outperform the other data-driven representations. Regarding the comparison with morphological analyzers, we find Brown clusters to give slightly better performance in part-of-speech tagging, but to be substantially outperformed in morphological tagging.
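
    To illustrate the zero-order posterior pruning step of the morphological tagger described above, the sketch below keeps, for each token, only the tags whose posterior probability exceeds a threshold chosen by bisection so that the average number of surviving readings per token matches a target; the posterior matrix is an invented stand-in for real CRF marginals.

```python
# Zero-order lattice pruning: keep tags whose marginal posterior exceeds a
# threshold tuned so the average number of surviving readings per token hits a
# target. The posteriors below are invented stand-ins for CRF marginals.
import numpy as np

def prune(posteriors, target_avg_readings=2.0, iters=40):
    lo, hi = 0.0, 1.0
    for _ in range(iters):                      # bisection on the probability threshold
        mid = (lo + hi) / 2.0
        avg = (posteriors >= mid).sum(axis=1).mean()
        if avg > target_avg_readings:
            lo = mid                            # too many readings survive: raise the threshold
        else:
            hi = mid
    threshold = (lo + hi) / 2.0
    return [np.flatnonzero(row >= threshold) for row in posteriors], threshold

posteriors = np.array([
    [0.70, 0.20, 0.05, 0.05],                   # fairly unambiguous token
    [0.40, 0.35, 0.15, 0.10],                   # ambiguous token keeps more readings
    [0.90, 0.05, 0.03, 0.02],
])
kept, thr = prune(posteriors)
print(thr, [list(k) for k in kept])
```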

    Grasp-sensitive surfaces

    Grasping objects with our hands allows us to skillfully move and manipulate them. Hand-held tools further extend our capabilities by adapting precision, power, and shape of our hands to the task at hand. Some of these tools, such as mobile phones or computer mice, already incorporate information processing capabilities. Many other tools may be augmented with small, energy-efficient digital sensors and processors. This allows graspable objects to learn about the user grasping them - and to support the user's goals. For example, the way we grasp a mobile phone might indicate whether we want to take a photo or call a friend with it - and thus serve as a shortcut to that action. A power drill might sense whether the user is grasping it firmly enough and refuse to turn on if this is not the case. And a computer mouse could distinguish between intentional and unintentional movement and ignore the latter. This dissertation gives an overview of grasp sensing for human-computer interaction, focusing on technologies for building grasp-sensitive surfaces and challenges in designing grasp-sensitive user interfaces. It comprises three major contributions: a comprehensive review of existing research on human grasping and grasp sensing, a detailed description of three novel prototyping tools for grasp-sensitive surfaces, and a framework for analyzing and designing grasp interaction: For nearly a century, scientists have analyzed human grasping. My literature review gives an overview of definitions, classifications, and models of human grasping. A small number of studies have investigated grasping in everyday situations. They found a much greater diversity of grasps than described by existing taxonomies. This diversity makes it difficult to directly associate certain grasps with users' goals. In order to structure related work and my own research, I formalize a generic workflow for grasp sensing. It comprises *capturing* of sensor values, *identifying* the associated grasp, and *interpreting* the meaning of the grasp. A comprehensive overview of related work shows that implementation of grasp-sensitive surfaces is still hard, researchers often are not aware of related work from other disciplines, and intuitive grasp interaction has not yet received much attention. In order to address the first issue, I developed three novel sensor technologies designed for grasp-sensitive surfaces. These mitigate one or more limitations of traditional sensing techniques: **HandSense** uses four strategically positioned capacitive sensors for detecting and classifying grasp patterns on mobile phones. The use of custom-built high-resolution sensors allows detecting proximity and avoids the need to cover the whole device surface with sensors. User tests showed a recognition rate of 81%, comparable to that of a system with 72 binary sensors. **FlyEye** uses optical fiber bundles connected to a camera for detecting touch and proximity on arbitrarily shaped surfaces. It allows rapid prototyping of touch- and grasp-sensitive objects and requires only very limited electronics knowledge. For FlyEye I developed a *relative calibration* algorithm that allows determining the locations of groups of sensors whose arrangement is not known. **TDRtouch** extends Time Domain Reflectometry (TDR), a technique traditionally used for inspecting cable faults, for touch and grasp sensing.
    TDRtouch is able to locate touches along a wire, allowing designers to rapidly prototype and implement modular, extremely thin, and flexible grasp-sensitive surfaces. I summarize how these technologies cater to different requirements and significantly expand the design space for grasp-sensitive objects. Furthermore, I discuss challenges for making sense of raw grasp information and categorize interactions. Traditional application scenarios for grasp sensing use only the grasp sensor's data, and only for mode-switching. I argue that data from grasp sensors is part of the general usage context and should only be used in combination with other context information. For analyzing and discussing the possible meanings of grasp types, I created the GRASP model. It describes five categories of influencing factors that determine how we grasp an object: *Goal* -- what we want to do with the object, *Relationship* -- what we know and feel about the object we want to grasp, *Anatomy* -- hand shape and learned movement patterns, *Setting* -- surrounding and environmental conditions, and *Properties* -- texture, shape, weight, and other intrinsics of the object. I conclude the dissertation with a discussion of upcoming challenges in grasp sensing and grasp interaction, and provide suggestions for implementing robust and usable grasp interaction.
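
    As a minimal illustration of classifying a grasp from a handful of capacitive readings (a nearest-centroid toy, not HandSense's actual classifier), consider the following sketch; the sensor values and grasp labels are invented.

```python
# Toy grasp classifier: assign a 4-sensor capacitive reading to the nearest
# centroid of previously recorded grasp patterns. All training values are invented.
import numpy as np

train = {
    "right_hand_talk": np.array([[0.90, 0.10, 0.80, 0.20], [0.85, 0.15, 0.75, 0.25]]),
    "two_hand_photo":  np.array([[0.60, 0.60, 0.60, 0.60], [0.55, 0.65, 0.60, 0.55]]),
    "in_pocket":       np.array([[0.20, 0.20, 0.25, 0.20], [0.15, 0.25, 0.20, 0.20]]),
}
centroids = {label: samples.mean(axis=0) for label, samples in train.items()}

def classify(reading):
    reading = np.asarray(reading)
    return min(centroids, key=lambda label: np.linalg.norm(reading - centroids[label]))

print(classify([0.88, 0.12, 0.79, 0.21]))  # -> "right_hand_talk"
```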

    Experience Prototyping for Automotive Applications

    In recent years, we started to define our life through experiences we make instead of objects we buy. To attend a concert of our favorite musician may be more important for us than owning an expensive stereo system. Similarly, we define interactive systems not only by the quality of the display or its usability, but rather by the experiences we have when using the device. A cell phone is primarily built for making calls and receiving text messages, but on an emotional level it might provide a way to be close to our loved ones, even though they are far away sometimes. When designing interactive technology, we do not only have to answer the question of how people use our systems, but also why they use them. Thus, we need to concentrate on experiences, feelings and emotions arising during interaction. Experience Design is an approach focusing on the story that a product communicates before implementing the system. In an interdisciplinary team of psychologists, industrial designers, product developers and specialists in human-computer interaction, we applied an Experience Design process to the automotive domain. A major challenge for car manufacturers is the preservation of these experiences throughout the development process. When implementing interactive systems, engineers rely on technical requirements and a set of constraints (e.g., safety) oftentimes contradicting aspects of the designed experience. To resolve this conflict, Experience Prototyping is an important tool translating experience stories to an actual interactive product. With this thesis I investigate the Experience Design process focusing on Experience Prototyping. Within the automotive context, I report on three case studies implementing three kinds of interactive systems, forming and following our approach. I implemented (1) an electric vehicle information system called Heartbeat, communicating the state of the electric drive and the batteries to the driver in an unobtrusive and reassuring way. I integrated Heartbeat into the dashboard of a car mock-up with respect to safety and space requirements but at the same time holding on to the story in order to achieve a consistent experience. With (2) the Periscope I implemented a mobile navigation device enhancing the social and relatedness experiences of the passengers in the car. I built and evaluated several experience prototypes in different stages of the design process and showed that they transported the designed experience throughout the implementation of the system. Focusing on (3) the experience of freehand gestures, GestShare explored this interaction style for in-car and car-to-car social experiences. We designed and implemented gestural prototypes for small but effective social interactions between drivers and evaluated the system in the lab and in an in-situ study. The contributions of this thesis are (1) a definition of Experience Prototyping in the automotive domain resulting from a literature review and my own work, showing the importance and feasibility of Experience Prototyping for Experience Design. I (2) contribute three case studies and describe the details of several prototypes as milestones on the way from an experience story to an interactive system. I (3) derive best practices for Experience Prototyping concerning their characteristics such as fidelity, resolution and interactivity as well as the evaluation in the lab and in situ in different stages of the process.

    Liquid decision making

    In today’s business world, decisions have to be made on different levels, including strategic, tactical, and operational levels. Decisions on the strategic level are characterized by their complexity, longevity and impact. Such decisions can benefit from the participation of a large, diverse group of people as they contribute different background knowledge, perspectives, and evaluation criteria. Typically, such decisions need to be considered over a prolonged period of time as opinions may need to be formed or may change due to the availability of new information. The goal of people in group decision making situations is typically to achieve good decisions. A mechanism is thus desirable that is capable of addressing the aforementioned challenges and of producing a good decision. For this work, a decision is thought to be good if it is predominantly based on the sincere opinions of the participants. In this thesis, we investigate the market metaphor as a promising approach for group decision making. Markets are attributed with the capability of gathering and aggregating assessments from people in a single indicator, the price. They allow for a continued participation over a prolonged time, reversibility of one’s market position by repeated trading, and the usage of individual evaluation criteria. For investigating the application of the market metaphor to decision making, we develop LDM, a market-based approach for group decision making. There, we represent a pending decision as a market and the decision options as stocks. Participants then buy shares of their favored stocks and sell shares of the stocks they dislike. High demand leads to a price increase whereas low prices are the result of low demand. The most favored decision options can be identified from the ranking of the stocks according to their prices. To support the achievement of a good decision, we model the market behavior of participants, devise design principles, identify suitable application scenarios, and determine appropriate functionalities for a market software. We furthermore devise the concept of market perturbations for uncovering the trading intentions of participants. We also implement a web-based software prototype of LDM. It provides functionalities for decision making, market trading, user handling, information exchange, and market perturbations. Participants there trade their favored stocks using virtual play money. We test the LDM approach and its software prototype in an EU-funded project, in a lab study, in the selection of research proposals, and in a university seminar for scenario building.
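
    A minimal sketch of the market metaphor as described above (not the LDM prototype itself): each decision option is a stock, buying raises its price, selling lowers it, and the price ranking reflects the aggregated opinion. The linear price-impact rule and all numbers are assumptions made for illustration.

```python
# Minimal decision market: prices move with net demand, and the option ranking
# is read off the prices. The linear price-impact rule is an illustrative choice.

class DecisionMarket:
    def __init__(self, options, start_price=100.0, impact=1.0):
        self.prices = {o: start_price for o in options}
        self.impact = impact

    def trade(self, option, shares):
        """Positive shares = buy (price rises), negative = sell (price falls)."""
        self.prices[option] = max(1.0, self.prices[option] + self.impact * shares)

    def ranking(self):
        return sorted(self.prices, key=self.prices.get, reverse=True)

market = DecisionMarket(["option_a", "option_b", "option_c"])
market.trade("option_a", +8)   # participant 1 strongly favors A
market.trade("option_b", +3)   # participant 2 mildly favors B
market.trade("option_a", -2)   # participant 3 revises their position on A
print(market.ranking(), market.prices)
```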

    Penalized regression for discrete structures

    Penalized regression models offer a way to integrate the selection of covariates into the estimation of a model. Penalized approaches are particularly well suited to accounting for complex structures in the covariates of a model. This thesis deals with various penalization approaches for discrete structures, where the term "discrete structure" here refers to all kinds of categorical covariates, effect-modifying categorical covariates, and group-specific effects in hierarchically structured data. What they have in common is that they can lead to a comparatively large number of coefficients to be estimated. There is therefore particular interest in learning which categories of a covariate influence the response, and which categories have different or similar effects on the response. Categories with similar effects can, for example, be identified by fused lasso penalties. However, some existing approaches are restricted to the linear model. The present work transfers these approaches to the class of generalized linear regression models, which involves computational as well as theoretical aspects. Specifically, a fused lasso penalty for effect-modifying categorical covariates in generalized linear regression models is proposed. It makes it possible to select covariates and to fuse the categories of a covariate. Group-specific effects, which account for the heterogeneity in hierarchically structured data, are a special case of such an effect-modifying categorical covariate. Here, the penalized approach offers two major advantages: (i) in contrast to mixed models, which make stronger assumptions, the degree of heterogeneity can easily be reduced; (ii) the estimation is more efficient than in the unpenalized approach. In orthonormal settings, fused lasso penalties can have conceptual drawbacks. As an alternative, an L0 penalty for discrete structures in generalized linear regression models is discussed, where the so-called L0 "norm" denotes an indicator function for arguments that are not equal to zero. As a penalty, this function is as interesting as it is challenging. If an approximation of the L0 norm is considered as a loss function, the conditional mode of the response is estimated in the limit.
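
    To make the fused lasso idea concrete in the simplest setting (a linear model with a single nominal covariate, rather than the GLMs treated in the thesis), the sketch below evaluates a least-squares objective with an all-pairs fusion penalty on the category effects and minimizes it with a generic optimizer; the simulated data, the penalty weight and the use of a general-purpose optimizer instead of a dedicated fused-lasso solver are all assumptions for illustration.

```python
# Fused-lasso-style penalty for a nominal covariate: penalize |beta_j - beta_k|
# for all category pairs so that categories with similar effects are pulled together.
# Data are simulated; Powell is a generic stand-in for dedicated fused-lasso solvers.
import numpy as np
from itertools import combinations
from scipy.optimize import minimize

rng = np.random.default_rng(1)
categories = rng.integers(0, 4, size=200)               # one factor with 4 categories
true_beta = np.array([0.0, 0.0, 1.0, 1.0])              # categories {0,1} and {2,3} share effects
y = true_beta[categories] + rng.normal(0, 0.3, size=200)

def objective(beta, lam=5.0):
    fit = np.sum((y - beta[categories]) ** 2)
    fusion = sum(abs(beta[j] - beta[k]) for j, k in combinations(range(len(beta)), 2))
    return fit + lam * fusion

beta_hat = minimize(objective, x0=np.zeros(4), method="Powell").x
print(np.round(beta_hat, 2))
# Pairs with a small estimated difference are candidates for fusion:
for j, k in combinations(range(4), 2):
    print(j, k, round(abs(beta_hat[j] - beta_hat[k]), 3))
```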

    New challenges for interviewers when innovating social surveys

    The combination of survey data with more objective information, such as administrative records, is a promising innovation within social science research. The advantages of such projects are manifold, but implementing them also bears challenges to be considered. For example, the survey respondents typically have to consent to the linking of external data sources and interviewers have to feel comfortable with this task. This dissertation investigates whether and to what extent the interviewers have an influence on the willingness of the respondents to participate in two new projects within the Survey of Health, Ageing and Retirement in Europe (SHARE). Both projects had the goal of reducing the burden on respondents and increasing data quality by linking the survey data with additional, more objective data. Both linkages required the interviewers to collect respondents’ written consent during the interview. The starting point of this dissertation is the question of what influences respondents’ decisions to consent to link their survey answers with administrative data. Three different areas are considered: characteristics of the respondents, the interviewers, and the interaction between respondents and interviewers. The results suggest that although respondent and household characteristics are important, a large part of the variation is explained by the interviewers. However, the information available about interviewers in SHARE is limited to a few demographic characteristics. Therefore, it is difficult to identify key interviewer characteristics that influence the consent process. To close this research gap, a detailed interviewer survey was developed and implemented in SHARE. This survey covers four different dimensions of interviewer characteristics: interviewers’ attitudes, their own behavior, experiences in surveys and special measurements, and their expectations regarding their success. These dimensions are applied to several aspects of the survey process, such as unit or item nonresponse as well as the specific projects of the corresponding SHARE questionnaire. The information collected in the interviewer survey is then used to analyze interviewer effects on respondents’ willingness to consent to the collection of blood samples. Those samples are analyzed in a laboratory and the results linked with the survey data. Interviewers’ experience and their expectations are of special interest because these are two characteristics that can be influenced during interviewer training and selection. The results in this dissertation show that the interviewers have a considerable effect on respondents’ consent to the collection of biomarkers. Moreover, the information collected in the interviewer survey can explain most of the variance on the interviewer level. A motivation for linking survey data with more objective data is the assumption that survey data suffer from recall error. In the last step, the overlap of information collected in the survey and provided in the administrative records is used to analyze recall error in the year of retirement. The comparison of the two datasets shows that most respondents remember the year they retired correctly. Nevertheless, a considerable proportion of respondents make recall errors. Characteristics can be identified which increase the likelihood of a misreport. However, the error seems to be unsystematic, meaning that no pattern of reporting the event of retirement too late or too early is found.

    A network QoS management architecture for virtualization environments

    Network quality of service (QoS) and its management are concerned with providing, guaranteeing and reporting properties of data flows within computer networks. For the past two decades, virtualization has become a very popular tool in data centres, yet without network QoS management capabilities. With virtualization, the management focus shifts from physical components and topologies towards virtual infrastructures (VI) and their purposes. VIs are designed and managed as independent isolated entities. Without network QoS management capabilities, VIs cannot offer the same services and service levels as physical infrastructures can, leaving VIs at a disadvantage with respect to applicability and efficiency. This thesis closes this gap and develops a management architecture enabling network QoS management in virtualization environments. First, requirements are derived based on real-world scenarios, yielding a validation reference for the proposed architecture. After that, a life cycle for VIs and a taxonomy for network links and virtual components are introduced to align the network QoS management task with the general management of virtualization environments and to enable the creation of technology-specific adaptors for integrating the technologies and sub-services used in virtualization environments. The core aspect shaping the proposed management architecture is a management loop and its corresponding strategy for identifying and ordering sub-tasks. Finally, a prototypical implementation showcases that the presented management approach is suited for network QoS management and enforcement in virtualization environments. The architecture fulfils its purpose and meets all identified requirements. Ultimately, network QoS management is one amongst many aspects of management in virtualization environments, and the architecture presented herein exposes interfaces to other management areas, whose integration is left as future work.
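
    The management loop mentioned in the abstract can be pictured as a simple measure-compare-enforce cycle that dispatches to technology-specific adaptors. The sketch below is a generic skeleton under that reading; the class names, the dummy measurement values and the enforcement action are assumptions, not the architecture's actual interfaces.

```python
# Skeleton of a QoS management loop: measure each virtual link, compare against
# its requirement, and ask a technology-specific adaptor to enforce corrections.
# All classes, names and values are illustrative assumptions.

class QoSRequirement:
    def __init__(self, link, max_latency_ms, min_bandwidth_mbps):
        self.link = link
        self.max_latency_ms = max_latency_ms
        self.min_bandwidth_mbps = min_bandwidth_mbps

class VSwitchAdaptor:
    """Stand-in for a technology-specific adaptor (e.g., for a virtual switch)."""
    def measure(self, link):
        return {"latency_ms": 12.0, "bandwidth_mbps": 80.0}   # dummy measurement
    def enforce(self, link, requirement):
        print(f"reconfiguring {link}: reserve {requirement.min_bandwidth_mbps} Mbit/s")

def management_loop(requirements, adaptor):
    for req in requirements:
        obs = adaptor.measure(req.link)
        violated = (obs["latency_ms"] > req.max_latency_ms or
                    obs["bandwidth_mbps"] < req.min_bandwidth_mbps)
        if violated:
            adaptor.enforce(req.link, req)

management_loop([QoSRequirement("vi1-link3", max_latency_ms=10.0, min_bandwidth_mbps=100.0)],
                VSwitchAdaptor())
```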

    Multi-purpose exploratory mining of complex data

    Due to the increasing power of data acquisition and data storage technologies, a large number of data sets with complex structure are collected in the era of data explosion. Instead of simple representations by low-dimensional numerical features, such data sources range from high-dimensional feature spaces to graph data describing relationships among objects. Many techniques exist in the literature for mining simple numerical data, but only a few approaches touch the increasing challenge of mining complex data, such as high-dimensional vectors of non-numerical data type, time series data, graphs, and multi-instance data where each object is represented by a finite set of feature vectors. In addition, there are many important data mining tasks for high-dimensional data, such as clustering, outlier detection, dimensionality reduction, similarity search, classification, prediction and result interpretation. Many algorithms have been proposed to solve these tasks separately, although in some cases they are closely related. Detecting and exploiting the relationships among them is another important challenge. This thesis aims to solve these challenges in order to gain new knowledge from complex high-dimensional data. We propose several new algorithms combining different data mining tasks to acquire novel knowledge from complex high-dimensional data: ROCAT (Relevant Overlapping Subspace Clusters on Categorical Data) automatically detects the most relevant overlapping subspace clusters on categorical data. It integrates clustering, feature selection and pattern mining without any input parameters in an information theoretic way. The next algorithm MSS (Multiple Subspace Selection) finds multiple low-dimensional subspaces for moderately high-dimensional data, each exhibiting an interesting cluster structure. For better interpretation of the results, MSS visualizes the clusters in multiple low-dimensional subspaces in a hierarchical way. SCMiner (Summarization-Compression Miner) focuses on bipartite graph data; it integrates co-clustering, graph summarization, link prediction, and the discovery of the hidden structure of bipartite graph data on the basis of data compression. Finally, we propose a novel similarity measure for multi-instance data. The Probabilistic Integral Metric (PIM) is based on a probabilistic generative model requiring few assumptions. Experiments demonstrate the effectiveness and efficiency of PIM for similarity search (multi-instance data indexing with M-tree), explorative data analysis and data mining (multi-instance classification). To sum up, we propose algorithms combining different data mining tasks for complex data with various data types and data structures to discover the novel knowledge hidden behind the complex data.
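
    For readers unfamiliar with multi-instance data, the sketch below computes a common baseline set distance (the average of minimal pairwise instance distances in both directions) between two objects represented as bags of feature vectors; it illustrates the data type and the kind of comparison involved, not PIM itself.

```python
# Baseline multi-instance distance: average the minimal instance-to-instance
# distances in both directions between two bags of feature vectors.
# (A simple symmetric variant of the minimal Hausdorff idea; not PIM.)
import numpy as np

def bag_distance(bag_a, bag_b):
    d = np.linalg.norm(bag_a[:, None, :] - bag_b[None, :, :], axis=2)  # pairwise distances
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

bag1 = np.array([[0.0, 0.0], [1.0, 0.5]])               # object 1: two instances
bag2 = np.array([[0.1, 0.1], [0.9, 0.4], [5.0, 5.0]])   # object 2: three instances
bag3 = np.array([[4.0, 4.0], [5.0, 5.5]])               # object 3: two instances
print(bag_distance(bag1, bag2), bag_distance(bag1, bag3))  # bag1 is closer to bag2
```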

    Density-based algorithms for active and anytime clustering

    Density-based algorithms for active and anytime clustering
    Data-intensive application areas such as biology, medicine, and neuroscience require effective and efficient data mining technologies. Advanced data acquisition methods produce data of constantly increasing volume and complexity. As a consequence, the need for new data mining technologies to deal with complex data has emerged during the last decades. In this thesis, we focus on the data mining task of clustering, in which objects are separated into different groups (clusters) such that objects inside a cluster are more similar to each other than to objects in different clusters. In particular, we consider density-based clustering algorithms and their applications in biomedicine. The core idea of the density-based clustering algorithm DBSCAN is that each object within a cluster must have a certain number of other objects inside its neighborhood. Compared with other clustering algorithms, DBSCAN has many attractive benefits: for example, it can detect clusters of arbitrary shape and is robust to outliers. Thus, DBSCAN has attracted a lot of research interest during the last decades, with many extensions and applications. In the first part of this thesis, we aim at developing new algorithms based on the DBSCAN paradigm to deal with the new challenges of complex data, particularly expensive distance measures and incomplete availability of the distance matrix. Like many other clustering algorithms, DBSCAN suffers from poor performance when facing expensive distance measures for complex data. To tackle this problem, we propose a new algorithm based on the DBSCAN paradigm, called Anytime Density-based Clustering (A-DBSCAN), that works in an anytime scheme: in contrast to the original batch scheme of DBSCAN, A-DBSCAN first produces a quick approximation of the clustering result and then continuously refines it during the further run. Experts can interrupt the algorithm, examine the results, and choose between (1) stopping the algorithm at any time, whenever they are satisfied with the result, to save runtime and (2) continuing the algorithm to achieve better results. Such an anytime scheme has proven in the literature to be a very useful technique when dealing with time-consuming problems. We also introduce an extended version of A-DBSCAN called A-DBSCAN-XS, which is more efficient and effective than A-DBSCAN when dealing with expensive distance measures. Since DBSCAN relies on the cardinality of the neighborhood of objects, it requires the full distance matrix to operate. For complex data, these distances are usually expensive, time-consuming or even impossible to acquire due to high cost, high time complexity, noisy and missing data, etc. Motivated by these potential difficulties of acquiring the distances among objects, we propose another approach based on DBSCAN, called Active Density-based Clustering (Act-DBSCAN). Given a budget limitation B, Act-DBSCAN is only allowed to use up to B pairwise distances, ideally producing the same result as if it had the entire distance matrix at hand. The general idea of Act-DBSCAN is that it actively selects the most promising pairs of objects for which to calculate distances and tries to approximate the desired clustering result as closely as possible with each distance calculation. This scheme provides an efficient way to reduce the total cost needed to perform the clustering and thus limits the potential weakness of DBSCAN when dealing with the distance sparseness problem of complex data.
As a fundamental data clustering algorithm, density-based clustering has many applications in diverse fields. In the second part of this thesis, we focus on an application of density-based clustering in neuroscience: the segmentation of white matter fiber tracts in the human brain acquired from Diffusion Tensor Imaging (DTI). We propose a model that evaluates the similarity between two fibers as a combination of structural similarity and connectivity-related similarity of fiber tracts. Various distance measure techniques from fields like time sequence mining are adapted to calculate the structural similarity of fibers. Density-based clustering is used as the segmentation algorithm. We show how A-DBSCAN and A-DBSCAN-XS are used as novel solutions for the segmentation of massive fiber datasets and provide unique features to assist experts during the fiber segmentation process.
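    The density-based core idea referred to above can be stated compactly in code. The following minimal Python sketch implements plain DBSCAN (not A-DBSCAN, A-DBSCAN-XS or Act-DBSCAN): a point is a core point if at least min_pts points lie within radius eps, and clusters grow by expanding from core points.

        # Minimal DBSCAN sketch: illustrates the core-point / neighborhood idea only.
        import numpy as np

        def dbscan(X: np.ndarray, eps: float, min_pts: int) -> np.ndarray:
            n = len(X)
            labels = np.full(n, -1)                     # -1 = noise / unassigned
            dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
            neighbors = [np.where(dist[i] <= eps)[0] for i in range(n)]
            cluster = 0
            for i in range(n):
                if labels[i] != -1 or len(neighbors[i]) < min_pts:
                    continue                            # already assigned or not a core point
                labels[i] = cluster
                seeds = list(neighbors[i])
                while seeds:
                    j = seeds.pop()
                    if labels[j] == -1:
                        labels[j] = cluster
                        if len(neighbors[j]) >= min_pts:            # core point: keep expanding
                            seeds.extend(k for k in neighbors[j] if labels[k] == -1)
                cluster += 1
            return labels

        X = np.random.RandomState(0).rand(50, 2)
        print(dbscan(X, eps=0.15, min_pts=4))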

    Wrapper algorithms and their performance assessment on high-dimensional molecular data

    Wrapper algorithms and their performance assessment on high-dimensional molecular data
    Prediction problems on high-dimensional molecular data, e.g. the classification of microarray samples into normal and cancer tissues, are complex and ill-posed since the number of variables usually exceeds the number of observations by orders of magnitude. Recent research in the area has propagated a variety of new statistical models in order to handle these new biological datasets. In practice, however, these models are always applied in combination with preprocessing and variable selection methods as well as model selection, which is mostly performed by cross-validation. Varma and Simon (2006) have used the term ‘wrapper algorithm’ for this integration of preprocessing and model selection into the construction of statistical models. Additionally, they have proposed the method of nested cross-validation (NCV) as a way of estimating their prediction error, which has by now evolved into the gold standard. In the first part, this thesis provides further theoretical and empirical justification for the usage of NCV in the context of wrapper algorithms. Moreover, a computationally less intensive alternative to NCV is proposed which can be motivated in a decision-theoretic framework. The new method can be interpreted as a smoothed variant of NCV and, in contrast to NCV, guarantees intuitive bounds for the estimation of the prediction error. The second part focuses on the ranking of wrapper algorithms. Cross-study validation is proposed as an alternative concept to the repetition of separate within-study validations if several similar prediction problems are available. The concept is demonstrated using six different wrapper algorithms for survival prediction on censored data on a selection of eight breast cancer datasets. Additionally, a parametric bootstrap approach for simulating realistic data from such related prediction problems is described and subsequently applied to illustrate the concept of cross-study validation for the ranking of wrapper algorithms. Eventually, the last part approaches computational aspects of the analyses and simulations performed in the thesis. The preprocessing before the analysis as well as the evaluation of the prediction models requires the usage of large computing resources. Parallel computing approaches are illustrated on cluster, cloud and high-performance computing resources using the R programming language. Usage of heterogeneous hardware and processing of large datasets are covered, as well as the implementation of the R package survHD for the analysis and evaluation of high-dimensional wrapper algorithms for survival prediction from censored data.
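    As an illustration of the nested cross-validation scheme discussed above, the sketch below wraps variable selection and hyper-parameter tuning inside an inner loop while the outer loop estimates the prediction error of the whole wrapper. It uses scikit-learn and a synthetic classification problem for brevity; the thesis itself works in R (e.g. with the survHD package) on censored survival data, so the concrete estimators here are stand-ins.

        # Hedged sketch of nested cross-validation (NCV) for a wrapper algorithm:
        # variable selection and tuning happen inside the inner loop, while the
        # outer loop estimates the prediction error of the complete wrapper.
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import GridSearchCV, cross_val_score
        from sklearn.pipeline import Pipeline

        X, y = make_classification(n_samples=100, n_features=500, n_informative=10,
                                   random_state=0)      # p >> n, as in microarray data

        wrapper = Pipeline([
            ("select", SelectKBest(f_classif)),          # variable selection step
            ("clf", LogisticRegression(max_iter=1000)),  # prediction model
        ])
        param_grid = {"select__k": [10, 50, 100], "clf__C": [0.1, 1.0, 10.0]}

        inner = GridSearchCV(wrapper, param_grid, cv=3)       # inner loop: model selection
        outer_scores = cross_val_score(inner, X, y, cv=5)     # outer loop: error estimation
        print("NCV accuracy estimate: %.3f" % outer_scores.mean())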

    Cross-domain application communication in the IP-based in-vehicle network

    Cross-domain application communication in the IP-based in-vehicle network
    In today's premium vehicles, up to 80 electronic control units (ECUs) communicate over up to six different networking technologies. At the same time, vehicle communication is opening up to the outside: the vehicle also communicates with the driver's smartphone and with the Internet. For communication across different application domains within the vehicle, gateways currently have to be used to translate between incompatible protocols. The trend in vehicle communication is therefore moving towards the Internet Protocol (IP), which was designed for communication across technologies and domains. Besides a uniform protocol on the network layer, the efficient development of a complex distributed system such as a vehicle also requires a suitable communication middleware. Communication in a vehicle places special requirements on such a middleware. On the one hand, different communication paradigms are used in vehicles, for example signal-based and function-based communication. On the other hand, the communication partners in a vehicle can differ considerably in their resources and complexity. No existing IP-based communication middleware fulfils the requirements for in-vehicle use identified in this work. The goal of this thesis is therefore to design a communication middleware suited for use in the vehicle. The presented solution provides several interoperable variants of the middleware, which resolve the conflict between differing functional requirements on the one hand and very heterogeneous communication partners on the other. Another fundamental part of the solution is the realization of the communication paradigms required in the vehicle. The function-based paradigm is implemented through simple remote procedure calls. The signal-based paradigm is implemented through a notification concept built on top of them. This enables a realization that is oriented more strongly towards the current information demand than the simple distribution of data in today's in-vehicle networks. It is shown that, in principle, both communication paradigms can be mapped onto a single mechanism which, depending on the middleware variants involved, operates on dynamic or only static data. In addition, a scalable marshalling takes into account the differing requirements of the applications and the differing performance of the participating ECUs. This enables end-to-end communication between all applications in the IP-based in-vehicle network. On this basis, the solution is extended with important system services. These services implement functions that can only be provided through the cooperation of several components, or they encapsulate common communication functionality for easy reuse. Two system services that are important for automotive use are presented as prototypes: a service management enables the administration of services in different states, and a security management maps security goals onto the best possible combination of the security protocols implemented by the communication partners involved. These system services are themselves scalable and can thus be adapted to the concept of different variants of the communication middleware.
Performance measurements on the prototypes developed within this work show that the designed communication middleware is suitable for use on embedded systems in the vehicle. The experimental setup is based on typical use cases for vehicle communication and uses automotive-qualified embedded computing platforms. In particular, it is demonstrated that even ECUs with low computational power can be integrated into the system with the described concept. The IP-based communication middleware can thus be deployed consistently on all relevant ECUs in the vehicle.
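    A minimal sketch of the layering described above, with hypothetical interfaces rather than the middleware's real API: a signal-based notification concept is built on top of simple request/response calls, so subscribers are informed only when a value actually changes instead of receiving every periodic broadcast.

        # Hypothetical illustration (not the thesis' middleware): signal-based
        # notifications layered on top of a simple request/response interface.
        from typing import Callable, Dict, List

        class SignalProvider:
            def __init__(self) -> None:
                self._values: Dict[str, float] = {}
                self._subscribers: Dict[str, List[Callable[[float], None]]] = {}

            # RPC-style request/response: read the current value on demand.
            def get(self, signal: str) -> float:
                return self._values[signal]

            # Notification concept: register interest once, get pushed updates.
            def subscribe(self, signal: str, callback: Callable[[float], None]) -> None:
                self._subscribers.setdefault(signal, []).append(callback)

            def update(self, signal: str, value: float) -> None:
                changed = self._values.get(signal) != value
                self._values[signal] = value
                if changed:                              # notify only on actual change
                    for cb in self._subscribers.get(signal, []):
                        cb(value)

        ecu = SignalProvider()
        ecu.subscribe("vehicle_speed", lambda v: print("speed update:", v))
        ecu.update("vehicle_speed", 50.0)
        print(ecu.get("vehicle_speed"))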

    The cockpit for the 21st century

    The cockpit for the 21st century
    Interactive surfaces are a growing trend in many domains. As one possible manifestation of Mark Weiser’s vision of ubiquitous and disappearing computers embedded in everyday objects, we see touch-sensitive screens in many kinds of devices, such as smartphones, tablet computers and interactive tabletops. More advanced concepts of these have been an active research topic for many years. This has also influenced automotive cockpit development: concept cars and recent market releases show integrated touchscreens of growing size. To meet the increasing information and interaction needs, interactive surfaces offer context-dependent functionality in combination with a direct input paradigm. However, interfaces in the car need to be operable while driving. Distraction, especially visual distraction from the driving task, can lead to critical situations if the sum of attentional demand emerging from both the primary and secondary tasks exceeds the available resources. So far, a touchscreen requires a lot of visual attention, since its flat surface does not provide any haptic feedback. There have been approaches to make direct touch interaction accessible while driving for simple tasks. Outside the automotive domain, for example in office environments, concepts for the sophisticated handling of large displays have already been introduced. Moreover, technological advances lead to new characteristics for interactive surfaces by enabling arbitrary surface shapes. In cars, two main characteristics of upcoming interactive surfaces are largeness and shape. On the one hand, spatial extension is increasing not only through larger displays, but also by taking objects in the surroundings into account for interaction. On the other hand, the flatness inherent in current screens can be overcome by upcoming technologies, so interactive surfaces can provide haptically distinguishable surfaces. This thesis describes the systematic exploration of large and shaped interactive surfaces and analyzes their potential for interaction while driving. To this end, different prototypes for each characteristic have been developed and evaluated in test settings suitable for their maturity level. These prototypes were used to obtain subjective user feedback and objective data, and to investigate effects on driving and glance behavior as well as usability and user experience. As a contribution, this thesis provides an analysis of the development of interactive surfaces in the car. Two characteristics, largeness and shape, are identified that can improve interaction compared to conventional touchscreens. The presented studies show that large interactive surfaces can provide new and improved ways of interaction in both driver-only and driver-passenger situations. Furthermore, the studies indicate a positive effect on visual distraction when additional static haptic feedback is provided by shaped interactive surfaces. Overall, various interaction concepts, which are not mutually exclusive, prove the potential of interactive surfaces for use in automotive cockpits, which is expected to be beneficial also in other environments where visual attention needs to be focused on additional tasks.

    Building a semantic search engine with games and crowdsourcing

    Building a semantic search engine with games and crowdsourcing
    Semantic search engines aim at improving conventional search with semantic information, or meta-data, on the data searched for and/or on the searchers. So far, approaches to semantic search exploit characteristics of the searchers like age, education, or spoken language for selecting and/or ranking search results. Such data allow a semantic search engine to be built as an extension of a conventional search engine. The crawlers of well-established search engines like Google, Yahoo! or Bing can index documents but, so far, their capabilities to recognize the intentions of searchers are still rather limited. Indeed, taking characteristics of the searchers into account considerably extends both the quantity of data to analyse and the dimensionality of the search problem. Well-established search engines therefore still focus on general search, that is, "search for all", not on specialized search, that is, "search for a few". This thesis reports on techniques that have been adapted or conceived, deployed, and tested for building a semantic search engine for the very specific context of artworks. In contrast to, for example, the interpretation of X-ray images, the interpretation of artworks is far from being fully automatable. Therefore, artwork interpretation has been based on Human Computation, that is, a software-based gathering of contributions by many humans. The approach reported on in this thesis first relies on so-called Games With A Purpose, or GWAPs, for this gathering: casual games provide an incentive for a potentially unlimited community of humans to contribute their appreciations of artworks. Designing convenient incentives is less trivial than it might seem at first. An ecosystem of games is needed in order to collect the intended meta-data on artworks: one game generates the data that serves as input for another game. This results in semantically rich meta-data that can be used for building up a successful semantic search engine. Thus, a first part of this thesis reports on a "game ecosystem" specifically designed around one known game and including several novel games belonging to the following game classes: (1) Description Games for collecting obvious and trivial meta-data, basically the well-known ESP (for extra-sensorial perception) game of Luis von Ahn, (2) the Dissemination Game Eligo generating translations, (3) the Diversification Game Karido aiming at sharpening differences between the interpreted objects, that is, the artworks, and (4) the Integration Games Combino, Sentiment and TagATag that generate structured meta-data. Secondly, the approach to building a semantic search engine reported on in this thesis relies on Higher-Order Singular Value Decomposition (SVD). More precisely, the data and meta-data on artworks gathered with the aforementioned GWAPs are collected in a tensor, that is, a mathematical structure generalising matrices to more than only two dimensions, columns and rows. The dimensions considered are the artwork descriptions, the players, and the artworks themselves. A Higher-Order SVD of this tensor is first used for noise reduction, in the spirit of Latent Semantic Analysis (LSA); this thesis also reports on deploying such a Higher-Order LSA. The parallel Higher-Order SVD algorithm applied for the Higher-Order LSA and its implementation have been validated on an application related to, but independent from, the semantic search engine for artworks striven for: image compression. This thesis reports on the surprisingly good image compression that can be achieved with Higher-Order SVD.
While compression methods based on matrix SVD treat each color separately, the approach reported on in this thesis relies on one single (higher-order) SVD of the whole tensor. This results in both a better quality of the compressed image and a significant reduction of the memory space needed. Higher-Order SVD is extremely time-consuming, which calls for parallel computation. Thus, a step towards automatizing the construction of a semantic search engine for artworks was to parallelize the higher-order SVD method used and to run the resulting parallel algorithm on a super-computer. This thesis reports on using Hestenes’ method and R-SVD for parallelising the higher-order SVD. This method is an unconventional choice, which is explained and motivated. As for the super-computer needed, this thesis reports on turning the web browsers of the players or searchers into a distributed parallel computer. This is done by a novel specific system and a novel implementation of the MapReduce framework for data parallelism. Harnessing the web browsers of the players or searchers saves computational power on the server side. It also scales extremely well with the number of players or searchers because both playing with and searching for artworks require human reflection and therefore result in idle local processors that can be brought together into a distributed super-computer.
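    For readers unfamiliar with the tensor decomposition used here, the following NumPy sketch shows a truncated higher-order SVD (HOSVD/Tucker approximation) of a three-way tensor such as an RGB image. It is a sequential illustration only, not the parallel Hestenes/R-SVD implementation described above, and the rank choices are arbitrary.

        # Truncated higher-order SVD (HOSVD) of a 3-way tensor, plain NumPy sketch.
        import numpy as np

        def unfold(T, mode):
            """Mode-n unfolding: move `mode` to the front and flatten the rest."""
            return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

        def hosvd_truncate(T, ranks):
            # Factor matrices: leading left singular vectors of each unfolding.
            U = []
            for mode, r in enumerate(ranks):
                u, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
                U.append(u[:, :r])
            # Core tensor: project T onto the factor matrices (mode products).
            core = T
            for mode, u in enumerate(U):
                core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
            return core, U

        def reconstruct(core, U):
            T = core
            for mode, u in enumerate(U):
                T = np.moveaxis(np.tensordot(u, T, axes=(1, mode)), 0, mode)
            return T

        image = np.random.rand(64, 64, 3)            # stand-in for an RGB image tensor
        core, U = hosvd_truncate(image, ranks=(16, 16, 3))
        approx = reconstruct(core, U)
        print("relative error:", np.linalg.norm(image - approx) / np.linalg.norm(image))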