## Abstract

Although artificial intelligence (AI) systems which support composition using predictive text are well established, there are no analogous technologies for mechanical design. Motivated by the vision of a predictive system that learns from previous designs and can interactively provide a list of established feature alternatives to the designer as design progresses, this paper describes the theory, implementation, and assessment of an intelligent system that learns from a family of previous designs and generates inferences using a form of spatial statistics. The formalism presented models 3D design activity as a “marked point process” that enables the probability of specific features being added at particular locations to be calculated. Because the resulting probabilities are updated every time a new feature is added, the predictions will become more accurate as a design develops. This approach allows the cursor position on a CAD model to implicitly define a spatial focus for every query made to the statistical model. The authors describe the mathematics underlying a statistical model that amalgamates the frequency of occurrence of the features in the existing designs of a product family. Having established the theoretical foundations of the work, a generic six-step implementation process is described. This process is then illustrated for circular hole features using a statistical model generated from a dataset of hydraulic valves. The paper describes how the positions of each design’s extracted hole features can be homogenized through rotation and scaling. Results suggest that within generic part families (i.e., designs with common structure), a marked point process can be effective at predicting incremental steps in the development of new designs.

## 1 Introduction

It has been argued that only 20% of design information is reused despite 90% of all design activities being based on the variants of existing designs [1], and on average only 28% of design information is reused within manufacturing applications [2]. Design can be considered as a sequential decision-making process, where the current state of a design evolves through a series of design choices. A system is required where design features may be suggested to the designer for effective reuse, and these design reuse procedures can be learned from historical data [3,4].

This paper introduces the underpinning mathematics required for implementation of a new generation of user interfaces that automatically identify appropriate characteristics of previous designs for reuse based on a designer’s real-time activity. As a design evolves, the system generates predictions of the features that might be incorporated, informed by both previous work and the new, ongoing design. In order to identify the most relevant features and avoid presenting the user with an overwhelming number of suggestions, the work reported exploits the location of information (i.e., features and mouse pointer) on a 3D computer-aided design (CAD) model so that predictions can be appropriate to specific positions on an engineering component. The system described assumes a single engineer developing a design by carrying out a series of operations on a CAD system. The system does not dictate any order of operations and allows the engineer’s focus to move around the component.

Designs seldom start with a blank sheet of paper but are informed by past experiences with reports of as much as 75% of design activity comprising the reuse of existing knowledge [5]. In the context of designing industrial parts, such activities comprise re-using, configuring, and assembling existing components. Several metrics have been developed to quantify the levels of commonality and reuse among families of similar products [6–8]. A key contributing factor to companies not performing projects on time and budget is the lack of knowledge reuse, which leads to frequent “reinventing the wheel” rather than finding and using already known solutions [9].

Motivated by these observations, this paper proposes a different form of design representation that can combine many design variations into a single probabilistic model that facilitates the reuse of previously used features during an interactive process that leads to the instantiation of a new design. By leveraging the available information, a probabilistic CAD system would prompt the engineers with fragments (i.e., features) of previously designed components to extend the current CAD design. Although reuse of common features in the design of many industrial products is desirable, there could be cases where such a practice inhibits innovation. Aware of this, the authors’ aim is not to automate design but to support it with suggestions that the engineer is free to ignore. For this, we propose modeling the design process as a marked point process (MPP) to create a formal framework that can assess the association between designs. Similar approaches have been used successfully in neuroanatomy to analyze brain scan images through voxel-based morphometry [10], as well as in feature recognition in image analysis.

For our application, points are the coordinate locations where design features have been placed, and marks refer to the features chosen. MPPs are a form of spatial statistics, i.e., metrics based on statistical tools that are used to characterize the distribution of events across space [11], and are widely used across a number of application areas, ranging from the distribution of trees in forests to that of stars in the sky. Through this lens, we view the behavior of engineers as a stochastic process, updating throughout the design process on decisions made to place features in specific locations and thereby supporting probabilistic measures for subsequent choices. The statistical inference can be supported through historical data, viewing past designs as realizations from such a stochastic process. Specifically, we develop a decision support system through a Bayesian methodology, where we start with a prior distribution to assign a probabilistic measure on the features and location to be chosen by the engineer. Following each choice, the prior distribution is updated to a posterior distribution based on this new data, thereby making full use of all the information available. So, as more design choices are made, the model will be able to discriminate more effectively between historical designs based on similarity.

Given the above context and motivation, the authors defined the following goals for the work:

### 1.1 Aim.

To define a computational framework that can support an interactive design process with suggestions of features based on three inputs: a knowledge of existing designs, the state of an emerging design, and a location on the surface of the emerging design.

### 1.2 Objectives.

Establish a method of homogenizing the orientation and dimensions of a collection of designs belonging to a product family.

Develop a statistical function that represents the probability of a particular feature occurring at a particular location on the surface of a design for a member of the product family.

Create a prototype implementation that can support an interactive design cycle which updates the inferred probability of specific features occurring at a given location as the design of a part is modified.

Assess the accuracy of the feature predictions.

Identify any inherent limitations or weakness in the approach.

The rest of this paper is structured as follows: in Sec. 2, we provide a brief review of both predictive design systems and relevant MPP literature to position the contribution of this work. In Sec. 3, we present a generic overview of our process to support design development and in Sec. 4, we outline the details of the mathematical model that underpins the process. We explore key characteristics of design activities and data to inform our modeling choices, and we provide a generic modeling and decision support framework. In Sec. 5, we evaluate the proposed modeling framework through a case study. Finally, in Sec. 6, we reflect on the future direction of research in this area.

## 2 Literature Review

Systems for predictive design have to combine assessment of historical data with statistical methods so that human users can easily choose, or ignore, suggestions that enhance the creative process. SMS text-messaging software, used by mobile phones, illustrates both the potential and challenges of engineering useful predictive systems. However, while text prediction seeks to identify patterns in a linear series of symbols with a simple (i.e., keypad) interface, anticipating the intention of a product designer requires analysis of 3D information that has no canonical ordering (i.e., unlike a sentence of text that reads from left to right, a designer can essentially edit shapes in any sequence). Despite these inherent difficulties, research into computational technologies that could enable predictive design systems has been reported for more than a decade.

An early example is Ref. [12] where Chaudhuri and Koltun developed the “InspireMe” interface which allowed a user to “place” and “glue” one of the ten suggestions, proposed in response to a query shape, and then request new suggestions for the resulting composite shape. The placed shape can be translated, rotated, and scaled to match the query shape. The suggestions that are not useful can be removed and replaced with new suggestions. To identify suggestions for a given shape query, Chaudhuri and Koltun [12] used a multidimensional histogram-based signature that encodes a shape’s global spatial structure and its local detail.

Later work recognized that there was potential to improve the accuracy of suggestions by combining the frequency of occurrence with shape parameter values. For example, Chaudhuri et al. [13] demonstrated an interface for an assembly-based modeling tool. The interface presents the user with semantically labeled tabs that can be expanded hierarchically to show component sub-categories. The user can select a component and drag it onto the current model. A probabilistic Bayesian network is then used to dynamically update both the proposed component categories and the components based on their semantic and stylistic compatibility with the current modeling state. The interface estimates whether the new component should have a symmetric counterpart and computes the symmetry plane. Based on the modeling requirements, the selected model can be moved to a position, rotated, scaled, duplicated, and glued.

While Chaudhuri and Koltun [12] focused on the design of assemblies of predefined component parts, Kalogerakis et al. [14] reported a predictive system for component shape synthesis. Their approach provided an interactive platform for the user to constrain shape synthesis based on high-level specifications (i.e., specific components, components from particular categories, and components from learned latent styles) and an input shape database. Within the proposed interactive shape synthesis interface, a user can select constraints by selecting required shape styles, component categories, and styles. The algorithm then proposes a list of synthesized objects based on the given inputs. The discrete features help ensure that components selected for a synthesized shape have compatible numbers of adjacent components of each type, and their edges have been identified and stored with the category label of components (so they can be attached for placing where a component can be attached to another component with symmetry relationships). Like Chaudhuri and Koltun [12], Kalogerakis et al. [14] also used a probabilistic approach to identify and synthesize existing shapes from complex domains to generate new combinations of components.

There is a tension in all reported work between accuracy and the number of predictions made. This can be observed in Ref. [15], which describes a user interface that guides a designer’s selection with a list of 50 “best” suggested components during an assembly-based modeling process. The interface aims to enable easy browsing and propose components that are most compatible with the current state of the assembly design (represented as a 3D model). The interface allows users to manually drag, move, scale, orient, and combine selected components. The placed components can also be incorporated in the design using Boolean operations (union, difference, and intersection) to obtain a composite model. The suggestion list automatically updates every time a component is added to the assembly. The suggestions are ranked by size (larger components are given preference) at the start of the modeling process. To score and rank predicted components, Jaiswal et al. [15] compute the marginal probability distribution from a factor graph that incorporates adjacency and multiplicity factors of segmented components.

A different type of assembly design is considered by Lam et al. [16], who proposed an algorithm that takes a partially completed 3D scene as input and proposes relevant models in a user-specified region of interest by leveraging text data. Suggestions are generated using three different approaches: graph kernel, N-gram, and merged. A query is generated by converting the given 3D scene into text that represents the five closest models to a focal point nominated by the user. The algorithm uses co-occurrence, 5-gram statistics from the Google Web N-grams dataset, and point-wise mutual information (MI) between the labels of nearby models in the scene and the labels of models in the database to create suggestions.

For a very similar application, Ref. [17] presented a method for synthesizing novel arrangements of diverse 3D objects from a few given examples. The method creates a probabilistic model for scenes based on Bayesian networks and Gaussian mixtures that can be trained by a small number of input examples of relevant scenes retrieved from a database. Users were able to vary the degrees of similarity and diversity in the generated scenes by controlling the weighting (through blending parameters) given to the influence of the existing database of prior designs.

The “AttribIt” interface was developed in Ref. [18] which facilitates the targeted exploration of different combinations of visual components using commands based on the relative semantic attribute. A user initializes a design with a coherent combination of components from a database, then they select a subset of these components and interactively increase or decrease the strength of an attribute using sliders. In doing this, they can observe changes to the whole design in real time as new database components corresponding to the updated attribute strengths are swapped. The components are assembled automatically into a coherent design (provision for manual adjustments such as translation, rotation, and scaling controls is available to refine the results). The interface shows regions of high geometric variation under the current attribute (highlighted in red color in Fig. 1).

When the overall form of a design (whether assembly or component) is constrained by function or the need to fit into a product family, a template can be used to facilitate reuse. For example, Schulz et al. [19] developed templates that can be used in an interactive design system to create new 3D models in a design-by-example manner. The interface allows a user to choose template parts from the database, change their parameters, and combine them to create new models. The information in the template is used to automatically position, align, and connect parts by adjusting parameters, adding constraints, and assigning connectors. The assembly-based modeling system allows users to pick and drag substructures from different designs and add them to a working model. The elements on the selected node are represented in full color, while others become semi-transparent during manipulation, and constrained degrees-of-freedom are hidden.

To support the generation of interior designs, Liu et al. [20] developed a probabilistic hierarchical grammar to enable functional (rather than spatial) representation of an office environment. The aim was to support consistent segmentations, category labels, and functional groupings of 3D scenes that characterize geometric properties, cardinalities, and spatial relationships in a hierarchical manner. A probabilistic grammar is used to automatically create consistent annotated scene graphs. Figure 2 illustrates an input scene mapped with labels and then converted into the hierarchical form using probabilistic grammar. A seven-dimensional descriptor (i.e., support and vertical relationships, horizontal separation, and overlap between objects) is used to describe the relationship between two objects. Dynamic programming for belief propagation was developed for scene parsing with optimal hierarchy. The technique creates candidate nodes based on spatial proximity and grammar binaries, and finds the optimal binary hierarchy, which is converted to a logical hierarchy of the original grammar.

More recently, Sung et al. [21] reported the “ComplementMe” user interface that aims to seamlessly integrate suggested CAD models into the design process. A combination of embedding and retrieval neural network architectures is proposed for suggesting complementary functional and stylistic components and their placements within an incomplete 3D part assembly. The embedding network was used to map parts to a low-dimensional feature space, and the retrieval network was used to retrieve partial assemblies to appropriate components. The interface shows the possible candidates generated by sampling from the conditional probability distribution predicted by the retrieval network. The user can select a desired complementary component, and the algorithm predicts the location for it via the placement network. The new shape is synthesized for the user, and the next component is proposed based on the modified assembly.

In conclusion, the literature on predictive design systems is largely focused on the creation of assemblies of 3D component models where frequently the positioning of suggested components is a manual task for the user. In contrast, the authors’ work is focused on the identification of shape features (i.e., fragments of an entire model that are patterns of geometry such as holes) that are appropriate to a location defined by the position of a user’s mouse pointer on the surface of a 3D object.

### 2.1 Marked Point Processes.

MPPs are widely applied within image analysis, where they were first introduced by Baddeley and Lieshout [22]. The methodology is used extensively and successfully for the extraction of multiple objects from images. Applications include biological imagery on cells [23], disks in a plane [24], building outlines [25], and person detection from camera images [26]. It is a flexible methodology that has been extended from object extraction in images to arbitrarily shaped objects [27]. More recently, Kim et al. [28] have developed the approach for microscope images, Zhao et al. [29] have used MPPs to automatically detect the locations of road segments, and Mbarki and Naouai [30] have used it for visual perceptions. A survey of MPPs applied to image analysis can be found in Ref. [31].

The literature to date has developed methods to extract images and characterize them in the form of an MPP, which is then stored in a database. Our focus complements this work, as we develop decision support tools that also utilize information about the location of extracted features in an MPP data structure.

## 3 Process for Constructing Marked Point Process Decision Support

We propose a six-step approach adapted from the CISSE process, see Ref. [32], for constructing empirical prior distributions to support Bayesian analysis, which considers the following five steps: *characterize*, *identify*, *sentence*, *select*, and *estimate*. As described in the following for the third step, we have placed particular focus on homogenizing the data rather than sentencing the data and we have decomposed the fifth step to consider prediction and updating.

*Step 1: Characterize the population of designs.* We begin by identifying those factors characterizing the design. This is an important step because it defines the criteria by which data sets (i.e., historical designs) are subsequently selected for inclusion in the comparator pool used to construct the prior distribution. Examples of such characteristics may be with respect to the types of layouts and/or features used within designs.

*Step 2: Identify candidate sample designs matching population.* The factors characterizing the population of designs can be switched on/off for candidate designs, effectively providing a means of making a relative assessment of relevance against a set of criteria. We are simply trying to find the best available datasets to make reasonable and timely inference. We are assuming that the current design for which we are providing the decision support will be similar to one of these historical designs. We can accommodate a unique a priori assessment on the likelihood of the current design being realized to be like each possible candidate historical design, although our default may be a uniform prior distribution.

*Step 3: Homogenizing the comparator data.* Generally, the higher the degree of homogeneity within the comparator pool, the more accurate the predictive inference [33]. This requires a measure for similarity between designs, such as the Kullback–Leibler (KL) divergence measure as proposed in Ref. [34], against which the data can be transformed for homogeneity. Two key approaches to address this are scaling and rotation. First, all designs can be re-scaled into the unit cube. Second, the data describing the locations of features can be rotated for alignment. This work should be performed prior to the start of the design. This stage may be omitted if it is considered that information would be lost in transforming the data, and the resulting prior would not be as effective at discriminating between design types.

*Step 4: Select a probability model for the population of designs.* The nature of design patterns is such that a parametric probability distribution is unlikely to exist that adequately represents the variability of location and features within designs. As such, a non-parametric approach should be considered, for which we recommend kernel density estimation (KDE). Under such an approach, choices will need to be made concerning the bandwidth parameter, which essentially decides the allowable variation in the location of features within similar designs. The resulting model is known as the feature location probability function (FLPF), for which we would fit one to each historic design to obtain a model for each design type.

*Step 5: Predictive model.* The predictive distribution is simply a weighted average of the FLPF for each design type in the comparator set, where the weights reflect the likelihood that the current design will ultimately be realized as being similar to the candidate design in the set.

*Step 6: Update prior on design type and predictive distribution.* During the design process, steps 5 and 6 are repeated in a cycle of feature addition and updating of the predictive distribution, which we call the predictive feature location function (PFLF), to reflect how each change impacts on the probable location of other features. This process, driven by the actions and selections of the human designer, continues until the component part is complete (i.e., the design is finished).
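The two homogenizing transformations named in step 3 can be illustrated with a minimal sketch. It assumes 2D feature locations, a single uniform scale factor (so that aspect ratio is preserved), and an alignment angle that is already known; the function names `rescale_to_unit_box` and `rotate_2d` are hypothetical and are not taken from the authors' implementation.

```python
import math

def rescale_to_unit_box(points):
    """Translate and uniformly scale points so that they fit inside [0, 1]^d."""
    dims = range(len(points[0]))
    lows = [min(p[k] for p in points) for k in dims]
    highs = [max(p[k] for p in points) for k in dims]
    # Assumption: one scale factor for all axes, set by the largest extent.
    extent = max(h - l for h, l in zip(highs, lows))
    if extent == 0.0:  # degenerate case: all points coincide
        return [tuple(0.0 for _ in dims) for _ in points]
    return [tuple((p[k] - lows[k]) / extent for k in dims) for p in points]

def rotate_2d(points, angle_rad):
    """Rotate 2D points counter-clockwise about the origin."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Hypothetical hole centers from one historical design (arbitrary units).
holes = [(10.0, 20.0), (30.0, 20.0), (30.0, 60.0)]
homogenized = rescale_to_unit_box(rotate_2d(holes, math.pi / 2))
```

How the alignment angle is chosen (for example, by minimizing a divergence between designs) is left open here, since the text above only states that the data can be rotated for alignment.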

Figure 3 provides a schematic for the predictive system. The data homogenization and FLPF can be performed in advance using the existing designs and features selected from the database in steps 1 and 2. As a new design evolves, the PFLF is generated from the FLPF and the design type prior. The PFLF can be updated in response to events to provide feature suggestions at interactive speeds.
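As a rough illustration of how a PFLF could be assembled from the FLPFs and the design type prior, the sketch below forms the weighted average described in step 5 and uses it to rank candidate features at a query location. The toy densities `type_a` and `type_b`, the feature names, and the weights are invented for illustration only.

```python
def predictive_density(type_weights, type_densities, x, y, m):
    """Weighted average of the per-design-type densities at (x, y) for feature m."""
    total = sum(type_weights)
    return sum(w / total * f(x, y, m) for w, f in zip(type_weights, type_densities))

def rank_features(type_weights, type_densities, x, y, features):
    """Order candidate features by predictive density at the query location."""
    scored = [
        (predictive_density(type_weights, type_densities, x, y, m), m)
        for m in features
    ]
    return [m for _, m in sorted(scored, reverse=True)]

# Toy, location-independent densities for two hypothetical design types.
def type_a(x, y, m):
    return {"hole": 2.0, "slot": 0.5}[m]

def type_b(x, y, m):
    return {"hole": 1.0, "slot": 3.0}[m]

# With most weight on design type A, "hole" is ranked ahead of "slot".
ranking = rank_features([0.9, 0.1], [type_a, type_b], 0.5, 0.5, ["hole", "slot"])
```

In an interactive setting, the query location would come from the cursor position and the weights from the current design type distribution.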

## 4 Model Development

In this section, a model is mathematically developed for steps 3–6 from the process in Sec. 3. This allows predictions of both the feature type and its spatial position.

### 4.1 Overview.

We model the process of a designer choosing to place features in specific locations as an MPP. As such, we can view historical designs as a realization from this process. Consider a design denoted by *d*_{i}, which comprises *n*_{i} features (not necessarily unique) and for each feature, which we denote with *m*, we have an associated location described by its (*x*, *y*) coordinates. We express the design as an unordered set of coordinates and features with $d_i=\{(x_1,y_1,m_1),\ldots,(x_{n_i},y_{n_i},m_{n_i})\}$. We restrict our designs to two dimensions expressed as (*x*, *y*) coordinates for simplicity, but the method is easily generalizable to higher dimensions.

We characterize the process through its intensity function at the (*x*, *y*) coordinates for feature *m*, given that the designer has already chosen to place various features at the locations captured in the matrix $\tilde{c}$. A characteristic of this function is that if we integrate the intensity over the whole (*x*, *y*) plane, then we obtain the expected number of features *m* in the design. Moreover, we can express this as a probability density function, i.e., $f(x,y,m\mid\tilde{c})$ given in Eq. (1), to describe the next choice made by the designer by normalizing it so that it integrates to 1. This function can then be used to rank features based on their likelihood of being placed at specific locations to provide appropriate decision support to the designer.
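As a worked form of the normalization just described, the relationship between the intensity and the density can be written as below. The symbol $\lambda$ for the intensity function and the dummy variables $u$, $v$, $m'$ are notation assumed here for illustration; they are not taken from the original equations.

$$
f(x,y,m\mid\tilde{c})=\frac{\lambda(x,y,m\mid\tilde{c})}{\sum_{m'}\iint\lambda(u,v,m'\mid\tilde{c})\,du\,dv}
$$

The denominator is the expected total number of features summed over all feature types, so dividing by it yields a function that sums and integrates to 1 over features and locations.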

Engineering designs possess dependency structures unlike other fields of MPP study, so “off the shelf” models for intensity functions are not available. Dependency refers to the association of choices, such that placing one feature in a location increases or decreases the likelihood of other features in various locations. Poor choices of dependency models can result in uninformative inference at best and misleading inference at worst. In typical spatial or temporal point process applications, self-exciting models are used to capture local dependency where the realization of one point increases the likelihood of nearby points being discovered. In design, choosing a feature for a location can have ramifications for distant locations due to a need for symmetry for example. We develop a methodology for characterizing such dependency.

Many designs may be a collection of few choices, so while there may exist a large database of historical designs, there are small sample sizes on which to infer the dependency structure. Inference is made more challenging with an extensive set of features from which to choose.

We propose a non-parametric approach to estimating the intensity functions that will provide a foundation on which to develop decision support, estimated from the data on historical designs. KDEs consist of modeling the intensity function of a point process through assigning a kernel, e.g., the normal distribution, centered at each location where a point has been realized, often resulting in a multi-modal probability model to describe the likelihood of discovering points. Typically, the kernel density requires the analyst to choose a value for the smoothing parameter (e.g., in the case of the normal kernel density, this would correspond to the standard deviation for each density used).

In Sec. 4.2, we will develop the non-parametric model for the density function based on KDE from historical designs. In Secs. 4.3 and 4.4, we will outline a Bayesian updating mechanism that will show how the density function, and with it the decision support, changes as the designer makes further choices. In Sec. 4.5, we will derive metrics to characterize the dependency structure implied by these modeling assumptions. Finally, in Sec. 4.6, we will consider transformations that we can make on historical design data to improve predictions, specifically, rescaling and rotating the data.

### 4.2 Model Description.

We assume that a new design will be similar in some sense to historical designs but not necessarily identical. As such, prior to commencing, an assessment should be made of the historical data that will be used to assess its suitability. Assuming we have a catalog of *n* historical designs that are appropriate for the decision support, then we consider that there are *n* types of design and the current design under construction will belong to one of these types. We will estimate the density function for each type with the data available from each design. Following this, we will apply a prior probability on the type of design being constructed based on the choices made.

Consider a design of type *i* for which there have been *n*_{i,m} choices of feature *m*. Using a KDE approach to estimate the probability density function for design of type *i* with respect to feature *m*, we have the density given in Eq. (2), where *ϕ*_{j}(·) is a bivariate normal density function, *μ*_{x,j} is the mean of the *x* variable in *ϕ*_{j}(·), *μ*_{y,j} is the mean of the *y* variable in *ϕ*_{j}(·), *x*_{i,m,j} is the location on the *x* coordinate in design *i* of the *j*th occurrence of feature *m*, *y*_{i,m,j} is the location on the *y* coordinate in design *i* of the *j*th occurrence of feature *m*, *σ* is the standard deviation for both *x* and *y*, although one could assume a more elaborate covariance structure if appropriate, and *c*_{xy} is a normalizing constant to ensure the density integrates to 1. It is worth noting that one could substitute other kernel density functions if more appropriate; we only require the kernel to possess all the characteristics of a bivariate probability density function.

We have assigned a uniform distribution over the plane for situations where that feature has not appeared in design *i*. It may be desirable to remove this, if one did not want to permit certain features for particular design types.

Essentially, the resulting density is a collection of normal densities centered about observed locations and the standard deviation parameter controls for the allowable variation from the historical design to be considered similar.
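A minimal sketch of such a density, under stated assumptions, is given below: isotropic normal kernels with a common standard deviation are centered at each observed location of one feature in one historical design, and a uniform value is returned when the feature never occurs. The function name `flpf` and the `uniform_density` argument are hypothetical, and the sketch does not renormalize for kernel mass that falls outside a bounded design region.

```python
import math

def flpf(x, y, locations, sigma, uniform_density=1.0):
    """Kernel density estimate at (x, y) from one design's locations of one feature.

    Falls back to a uniform density when the feature never occurs in the design.
    """
    if not locations:
        # Assumption: the design region has unit area, so the uniform density is 1.
        return uniform_density
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)  # isotropic bivariate normal constant
    total = 0.0
    for xj, yj in locations:
        sq_dist = (x - xj) ** 2 + (y - yj) ** 2
        total += norm * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / len(locations)  # equal weight for each observed occurrence

# Two hypothetical hole locations in a design rescaled to the unit square.
density_near_hole = flpf(0.2, 0.5, [(0.2, 0.5), (0.8, 0.5)], sigma=0.05)
density_far_away = flpf(0.5, 0.1, [(0.2, 0.5), (0.8, 0.5)], sigma=0.05)
```

A smaller `sigma` concentrates the density around the historical locations, which corresponds to tolerating less positional variation before a new design is treated as dissimilar.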

Let *I* be the random variable describing the design type that the designer is developing and *M* be the random variable describing the next feature to be chosen. To express the unconditional probability density function, we first define three indicator functions to denote design type, feature, and presence in Eq. (3).

We denote the probability of feature *m* appearing in design type *i* with *p*_{i,m} and the probability of the design being of type *i* with *π*(*i*). Combining these, the full probability density function describing the likelihood of a feature *m* being located at (*x*, *y*) and the design being of type *i* is given in Eq. (4).

Summing over *I*, we obtain the distribution for location and feature only, given in Eq. (5), where *g*_{j}(·) is a univariate normal density function.

### 4.3 Probability of a Feature Being Selected for a Design.

Given the total number of incidences within a design, we assume the number of incidences of each possible feature for a design is a realization from a multinomial distribution. Moreover, we assume that the underlying probabilities associated with each feature vary across design types. Under such a modeling assumption, a natural estimator of the probability of a feature being selected for a design of a particular type would be the observed frequency on similar designs from the class. However, given that we have at most one design for each type, we are likely to produce poor inference due to small samples. Moreover, we are likely to be faced with a large number of features with zero events data resulting in an estimated probability of 0. This creates a particular issue for the decision support being developed, as all historical designs that did not possess all the features chosen for a current design would be ruled out as candidate design types through Bayesian updating. As such, allowing for non-zero probability estimates would permit the inclusion of candidate design types even if they do not include all the features chosen at some point in the design process. For a discussion on alternative estimation methods for zero event data, see Ref. [35].

*w* = *β*/(*β* + *k*_{i}), *k*_{i,z} gives the number of features in design *i* of type *z*, $k_i=\sum_{\forall z}k_{i,z}$, $\beta_z=\sum_{\forall i}k_{i,z}$, and $\beta=\sum_{\forall z}\beta_z$.

We see that *p*_{i,z} is a weighted average of the observed frequency *k*_{i,z}/*k*_{i} and the prior mean. The weight applied to the frequency increases as the number of features chosen for design *i* increases, i.e., *k*_{i}.
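The weighted average can be illustrated with a small numerical sketch. It assumes that the prior mean for feature type *z* is the pooled proportion *β*_{z}/*β* and that *w* is the weight placed on that prior mean; both are assumptions made for illustration, and the function name `feature_probabilities` is hypothetical.

```python
def feature_probabilities(counts):
    """Shrinkage estimate of p_{i,z} from counts[i][z] = k_{i,z}.

    Assumption: the prior mean for type z is the pooled proportion beta_z / beta.
    """
    n_types = len(counts[0])
    # beta_z: total number of features of type z across all designs
    beta_z = [sum(row[z] for row in counts) for z in range(n_types)]
    beta = sum(beta_z)
    probs = []
    for row in counts:
        k_i = sum(row)                      # features chosen for design i
        w = beta / (beta + k_i)             # weight on the prior mean
        probs.append([
            w * (beta_z[z] / beta) + (1 - w) * (row[z] / k_i if k_i else 0.0)
            for z in range(n_types)
        ])
    return probs


# Two designs, three feature types; design 0 never uses type 1.
counts = [[4, 0, 1], [0, 6, 2]]
probs = feature_probabilities(counts)
print(probs[0])  # every entry is non-zero, including the unobserved type
```

Under these assumptions, a feature type with zero observed instances in a design still receives a non-zero probability, so that design is not ruled out by later updating.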

### 4.4 Bayesian Updating.

*π*(*i*), which describes the uncertainty concerning the design type. In this section, we present a Bayesian updating of this distribution based on design choices. Assume that the designer has made *n*_{k} choices, then the posterior distribution for the design type is updated as in Eq. (11).

This function can be used to provide inference on the relative likelihood of features being located on specified positions through comparing ratios.
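A single updating step can be sketched as follows. The sketch assumes the posterior weight of design type *i* is proportional to its prior weight multiplied by *p*_{i,m} and by a product-normal kernel density of that design's instances of feature *m* at the chosen location, with a uniform density when the design lacks the feature. The data layout and the names `location_density` and `update_posterior` are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def location_density(x, y, instances, sd=0.05):
    """Product-normal kernel density at (x, y); uniform on the unit square if empty."""
    if not instances:
        return 1.0
    total = sum(normal_pdf(x, cx, sd) * normal_pdf(y, cy, sd) for cx, cy in instances)
    return total / len(instances)


def update_posterior(prior, designs, feature_probs, choice, sd=0.05):
    """One Bayes step: posterior(i) is proportional to prior(i) * p_{i,m} * density_i(x, y)."""
    x, y, m = choice
    weights = [
        pi * feature_probs[i][m] * location_density(x, y, designs[i].get(m, []), sd)
        for i, pi in enumerate(prior)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical data: each design maps a feature label to its (x, y) instances.
designs = [
    {"d10": [(0.2, 0.2), (0.8, 0.8)]},
    {"d10": [(0.5, 0.5)], "d20": [(0.2, 0.8)]},
]
feature_probs = [{"d10": 0.9, "d20": 0.1}, {"d10": 0.5, "d20": 0.5}]
prior = [0.5, 0.5]
posterior = update_posterior(prior, designs, feature_probs, (0.2, 0.2, "d10"))
print(posterior)  # weight shifts toward design 0, which has a "d10" hole at (0.2, 0.2)
```

Repeating the step with each new choice, using the previous posterior as the next prior, gives the sequential behavior described above.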

### 4.5 Dependency Structure.

The moments of the model are easily obtained through conditional expectation arguments, resulting in the expectations given in Appendix A. Setting $\delta_{n_{im}}=1$ for all designs would yield the moments anticipated in the historical designs; however, our model accommodates the possibility of features appearing in design types which are not present in the associated historical design.

*x*, *y*) coordinates. However, such commonly used measures are limited within our context as they focus on the linear relationship between only two variables. We may wish to consider more general settings such as non-linear relationships as well as 3D designs or even the dependency between features and locations. For this, we use the MI measure, which we denote by *ω*, to assess dependency. The concept of MI is linked to the entropy of a random variable, which quantifies the expected amount of information held in a random variable. The MI measure considers the information gained from modeling the joint distribution rather than assuming each variable is independent.

Clearly, *ω* = 0 if *f*(*x*, *y*, *m*) = *f*(*x*)*f*(*y*)*f*(*m*), i.e., if the variables are independent. Moreover, it can be shown that as dependency increases so too does the measure. This measure can be useful for comparing dependency between various subsets of designs, noting that the stronger the dependency, the better the predictions will be. Some analysts prefer to transform this measure to bound it within (0, 1) and as such use the transform $\dot{\omega}=\sqrt{1-e^{-2\omega}}$ [36]. The joint and marginal distributions required to calculate the MI are provided in Appendix B.
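The behavior of the measure and its bounded transform can be checked with a short sketch. For brevity it uses a discrete distribution over two variables rather than the three-variable density *f*(*x*, *y*, *m*), which is a simplification. The transform is taken as the square root of 1 − exp(−2*ω*), the form that reproduces the scaled values quoted in Sec. 5.5 (0.97 and 0.84 from MI scores of 1.43 and 0.62). The function names are hypothetical.

```python
import math


def mutual_information(joint):
    """MI in nats of a discrete joint distribution given as a 2D list of probabilities."""
    row = [sum(r) for r in joint]            # marginal of the first variable
    col = [sum(c) for c in zip(*joint)]      # marginal of the second variable
    return sum(
        p * math.log(p / (row[a] * col[b]))
        for a, r in enumerate(joint)
        for b, p in enumerate(r)
        if p > 0
    )


def scaled_mi(omega):
    """Map an MI value onto (0, 1) via sqrt(1 - exp(-2 * omega))."""
    return math.sqrt(1.0 - math.exp(-2.0 * omega))


independent = [[0.25, 0.25], [0.25, 0.25]]   # joint equals product of marginals
dependent = [[0.5, 0.0], [0.0, 0.5]]         # one variable determines the other

print(mutual_information(independent), mutual_information(dependent))
print(round(scaled_mi(1.43), 2), round(scaled_mi(0.62), 2))
```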

### 4.6 Re-scaling and Rotating.

Generally, the higher the degree of homogeneity in the comparator pool of data, then the greater the accuracy in the prediction [33] and as such pre-processing the relevant historical data to achieve greater homogeneity may be desirable. We consider re-scaling and rotating the data for each as a means to achieve this. However, such transformations may not always be beneficial as key information may be lost that helps identify the most similar historical designs. The advantage of such transformations is through identifying regions where specific features are highly likely to be located for a large number of design types. The disadvantage can be blurring distinctive characteristics between design types and as such it will take longer for the process to learn precisely to which design type it belongs.

*R* shown in Eq. (15).

*R*, we obtain the new coordinates as in Eq. (16).

An analyst could decide upon rotation and rescaling based on visual inspection. However, for a more rigorous approach, we would need to measure the distance between designs and seek to minimize it. Following the approach proposed by Vasantha et al. [34], we use the KL divergence measure to assess the difference between designs. Using the superimposition of all designs as an average design, we can measure the difference of each design to the average and seek to minimize it.

*v* to *u*, denoted by *D*_{KL}(*D*_{u}‖*D*_{v}), is given in Eq. (17).

*u*. Expressing this as an expectation provides a computational advantage; as a closed form analytical solution is not available, we can conduct Monte Carlo simulations with the distribution of *u* and evaluate the average of the expression. In sum, we can transform each design through rotation and re-scaling to minimize the KL divergence of the mean design to the design in question.
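The Monte Carlo evaluation can be sketched as follows, assuming each design is represented by a 2D kernel density with an isotropic normal kernel over the locations of a single feature type. The function names, the bandwidth of 0.05, and the sample size are illustrative assumptions.

```python
import math
import random


def kde(point, centres, sd):
    """Isotropic normal kernel density at `point`, floored to avoid log(0)."""
    x, y = point
    total = sum(
        math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sd * sd)) for cx, cy in centres
    )
    return max(total / (2 * math.pi * sd * sd * len(centres)), 1e-300)


def sample_kde(centres, sd, rng):
    """Draw one point: pick a kernel centre uniformly, then add normal noise."""
    cx, cy = rng.choice(centres)
    return rng.gauss(cx, sd), rng.gauss(cy, sd)


def kl_monte_carlo(u, v, sd=0.05, n=2000, seed=0):
    """Estimate D_KL(D_u || D_v) as the sample mean of log f_u(X) - log f_v(X), X ~ D_u."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        p = sample_kde(u, sd, rng)
        total += math.log(kde(p, u, sd)) - math.log(kde(p, v, sd))
    return total / n


u = [(0.2, 0.2), (0.8, 0.8)]
v = [(0.2, 0.8), (0.8, 0.2)]
print(kl_monte_carlo(u, u), kl_monte_carlo(u, v))  # zero for identical designs, positive otherwise
```

Increasing `n` reduces the simulation noise in the estimate at the cost of more density evaluations.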

### 4.7 Summary.

Section 4 has outlined the underlying model and process to support the prediction of features given location. This can be used with an interactive CAD system, where the cursor sits in a location described by its coordinates and the recommended feature is suggested. In Sec. 5, we apply this to a dataset.

## 5 Case Study

To allow an intuitive, visual understanding of the proposed process, we have chosen to use a set of 513 mechanical valve designs. The structure of the valve bodies has obvious regularities with circles around the valve’s flanges together with other functional holes. An unordered set of hole diameters and associated (*x*, *y*, *z*) coordinates was extracted for each valve body from the B-rep of the CAD design using the twig match algorithm [37]. Further details are provided in Ref. [34]. An example valve design is shown in Fig. 4(a) with the extracted hole features, scaled to [0, 1], shown in Fig. 4(b).

In this analysis, the aim is to predict the sequential addition of hole features and their position given the state of the current design, with the focus on features occurring on the same surface plane, i.e., predicting a hole diameter on the flange surface.

### 5.1 Scaling.

To facilitate prediction, the feature coordinates of each design were scaled to the unit cube—each dimension was scaled to [0, 1]—and additional re-scaling was required on each cube surface so that the features retained their geometric shape. Models were then estimated using the features and feature positions which were positioned on the surface of the cube, one surface at a time. For example, after scaling, the data were subset to analyze the features on the *x* = 0 face. Figure 5 shows the superimposition of the scaled feature coordinates from all designs on the *x* = 0, *y* = 0, and *z* = 0 faces (some jitter of the points has been added to the figure to aid feature discrimination).

### 5.2 Kernel Density Estimation.

The KDEs were estimated across each face of the cube. This was done by dividing the face into *N* by *N* regular grid positions and then estimating the kernel density at each grid position for each feature in all designs. A normal kernel with a user-specified standard deviation was applied, chosen to be 0.05 in this analysis, and the density on each dimension was calculated independently. If a design did not have a specific feature that was present in the database pool of features, then a uniform probability across the grid of positions was assumed. This allows predictions to be generated for a new design that uses a combination of hole features that has not been previously observed. The KDE outputs a density estimate at each position in the *N* by *N* grid for every feature in the database.
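The grid evaluation for one feature on one face can be sketched as below. The sketch reads "calculated independently" as a product of two univariate normal kernels, which is an assumption, and the grid size of 51 (a step of 0.02) is an illustrative choice rather than a value taken from the case study. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def grid_density(instances, n=51, sd=0.05):
    """Kernel density on an n-by-n grid over the unit square.

    A design with no instances of the feature gets a uniform density.
    """
    ticks = [k / (n - 1) for k in range(n)]
    if not instances:
        return [[1.0 for _ in ticks] for _ in ticks]
    grid = []
    for gy in ticks:
        row = []
        for gx in ticks:
            # Product of independent univariate normal kernels in x and y
            d = sum(normal_pdf(gx, cx, sd) * normal_pdf(gy, cy, sd) for cx, cy in instances)
            row.append(d / len(instances))
        grid.append(row)
    return grid


def grid_mode(grid):
    """Return the (row, col) index of the highest-density grid position."""
    return max(
        ((r, c) for r, row in enumerate(grid) for c, _ in enumerate(row)),
        key=lambda rc: grid[rc[0]][rc[1]],
    )


grid = grid_density([(0.5, 0.5)], n=51)
print(grid_mode(grid))  # index of the grid cell nearest the single hole
```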

### 5.3 Evaluation.

Both the correctness of the FLPF and the predictive accuracy of the PFLF were assessed using tenfold cross-validation. The kernel density across the features was estimated using the training data and then evaluated on the designs in the test set. Each test design, which contains hole feature labels and their coordinates, was evaluated one at a time. Three measures of predictive performance were used: the distance from the observed feature coordinate to the nearest predicted mode was calculated using two approaches, and reciprocal rank was used to evaluate how accurately a feature was predicted at its observed position. Further details are provided in the Supplemental Material S1.3 available in the Supplemental Materials on the ASME Digital Collection.
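The reciprocal rank measure can be illustrated with the standard definition: the features are ordered by predictive density at the observed position, and the score is one over the position of the feature that was actually added. This is a sketch of the usual definition; the exact procedure used in the study is the one given in the Supplemental Material. The feature labels and density values below are made up for illustration.

```python
def reciprocal_rank(scores, observed):
    """scores maps each feature label to its predictive density at the observed position."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # most likely feature first
    return 1.0 / (ranked.index(observed) + 1)


def mean_reciprocal_rank(cases):
    """cases is a list of (scores, observed_feature) pairs."""
    return sum(reciprocal_rank(s, o) for s, o in cases) / len(cases)


# Illustrative densities for three hole diameters at one location
scores = {"33.0": 0.2, "22.0": 0.7, "17.29": 0.1}
print(reciprocal_rank(scores, "33.0"))
print(mean_reciprocal_rank([(scores, "22.0"), (scores, "33.0")]))
```

A value of one means the observed feature was the top suggestion, and values near zero mean it appeared far down the list.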

An illustration of the Bayesian updating and predictions on one test design is provided in the Supplemental Material S1 available in the Supplemental Materials on the ASME Digital Collection.

### 5.4 Results.

Figure 6 illustrates the aggregated results from the cross-validation. The *x*-axis indicates how many holes have been added to a new design (e.g., if a test design has four features, then the predictive densities, distances to mode, and ranks are calculated after sequentially adding 0, 1, 2, or 3 holes to the new design). The *y*-axis gives the mean distance to the mode (on the raw scale or in grid steps) or the mean reciprocal rank. The red triangle gives the mean across the ten folds. The performance of the predicted rank of suggestions is shown in the third plot; values range from zero (poor suggestions) to one (perfect suggestions).

As expected, an initial improvement is observed in the distances to the nearest predictive mode as additional features are added to each new test design. However, there is a clear pattern of extreme values within all figures, which results in a decrease in performance as additional holes are added to a design. This can be explained by the fact that within each test fold there are a few designs which are unlike anything in the training set, and thus the KDE does not provide reasonable predictions.

Some examples follow. First, the training designs that contain a specific feature may have a different number of instances of that feature than the test design. In one example, within a training dataset instance, designs with the hole diameter “33.0” have between 3 and 13 instances on the flange plane; however, it occurs 16 times within a test design. This results in all the predictive modes being slightly offset, as seen in the Supplemental Material S4.1 available in the Supplemental Materials on the ASME Digital Collection. The modeling framework cannot infer the coordinates for these features even though they are still placed within the same circular arrangement. Second, the training designs that contain a specific feature may position it differently from the test design. In the example shown in the Supplemental Material S4.2 available in the Supplemental Materials on the ASME Digital Collection, the hole diameter “35.0” was used as a central bore hole in the training data designs; however, it was used as the bolt connector within the new test design. Therefore, as additional holes are added to the new design, the updated predictive density provides little information. Third, the order in which the features are added to a new design affects the predictive density, particularly when there are multiple types (hole dimensions) of feature, and this can impact the quality of predictive guidance. In the example shown in the Supplemental Material S4.3 available in the Supplemental Materials on the ASME Digital Collection, there are 30 designs within a training dataset instance that contain the “22.0” diameter, but only one of these also has the additional “17.29” diameter. The early selection of the “17.29” feature adds more probability weight onto the single design in the training set, and it takes several further additions for the predictions to improve.

The predictive performance of the method was re-evaluated omitting the 24 unusual designs from the test datasets, and the results are shown in Fig. 7. This was done to examine the predictive performance of the model for a designer who remains within the catalog of previous designs. It can be seen that the predictive performance improves as more features are added to a new design. This again indicates that the utility of the method is dependent on the designs forming a homogeneous set. The folds with larger values can be explained by the ordering of the features entering into the design, as illustrated by the example given in the Supplemental Material S4.3 available in the Supplemental Materials on the ASME Digital Collection. While the distance to the nearest predictive mode may be small, there remain extreme values in the rank predictions. This indicates that while a feature is expected at a position, our model has been unable to predict the specific feature, which suggests that the feature added at this position is unusual given those observed in the training data.

The prediction results for the features on the *y* = 0 and *z* = 0 cube faces are provided in the Supplemental Material S5 available in the Supplemental Materials on the ASME Digital Collection. Performance is similar to the *x* = 0 face, with predictions improving as additional features are added to the design. Within the features on these faces, there are two designs that are unlike any of the other designs; the effect of this is more apparent on the *y* = 0 face. Again, removal of the outlier designs resulted in improved statistics (results not provided).

### 5.5 Association Measure.

The utility of the method is supported by the homogeneity of the design database. Section 4.5 described how the MI could be used to provide some measure of the expected dependence in a database; however, there is no analytic solution to Eq. (14) for our non-parametric model. We therefore estimate this measure through a simulation exercise. We denote *i*_{max} as the number of designs in the database, *m*_{max} as the number of unique marks (features), and *n*_{i,m} as the number of marks of type *m* in design *i*. The probability of randomly choosing design *i* is given by *q*_{i} = 1/*i*_{max} and the probability of selecting mark *m* given that design *i* was selected is defined by $p_{i,m}=n_{i,m}/\sum_{\forall m}n_{i,m}$.

For a given design database, a representative random sample of feature instances is generated, using the following steps:

1. Uniformly sample a design *i* from the set of designs in the database with probability *q*_{i}.
2. Randomly sample a feature type *m* from design *i* with probability *p*_{i,m}.
3. Sample a single instance of feature type *m*, as there may be multiple instances of feature *m* within design *i*.
4. Take a random sample from the normal kernel with mean at the (*x*, *y*) coordinates.
5. Repeat many times.

An estimate of the MI can then be calculated using the expressions given in Appendix B where the KDE of the designs in the database are evaluated at the sampled coordinates.
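The sampling steps can be sketched as follows. The database layout (a list of designs, each mapping a feature label to its list of (*x*, *y*) instances), the bandwidth of 0.05, and the function names are assumptions made for illustration.

```python
import random


def sample_feature_instance(database, sd=0.05, rng=random):
    """Draw one (design index, feature, x, y) sample from the database."""
    # Step 1: choose a design uniformly, q_i = 1 / i_max
    design_index = rng.randrange(len(database))
    design = database[design_index]
    # Step 2: choose a feature type with probability proportional to n_{i,m}
    types = list(design)
    counts = [len(design[m]) for m in types]
    feature = rng.choices(types, weights=counts, k=1)[0]
    # Step 3: choose one instance of that feature type
    cx, cy = rng.choice(design[feature])
    # Step 4: draw from the normal kernel centred on the instance
    return design_index, feature, rng.gauss(cx, sd), rng.gauss(cy, sd)


def sample_many(database, n, sd=0.05, seed=0):
    """Step 5: repeat the draw n times with a reproducible generator."""
    rng = random.Random(seed)
    return [sample_feature_instance(database, sd, rng) for _ in range(n)]


# Hypothetical two-design database
database = [
    {"33.0": [(0.2, 0.2), (0.8, 0.8)]},
    {"22.0": [(0.5, 0.5)], "17.29": [(0.1, 0.9)]},
]
samples = sample_many(database, 200)
print(samples[0])
```

Each sample keeps the design index and feature label alongside the coordinates, so the joint and marginal densities can later be evaluated at the same points.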

The dependence structure within our sample database was estimated using this method across the ten training cross-validation datasets, and the resulting MI scores had mean 1.43 and standard deviation 0.03. This equates to a scaled $\dot{\omega}=0.97$, indicating that there is strong dependence within the data and thus we would expect predictions to be good. For comparison, a null distribution was estimated for the statistic on the same training data, permuting the feature instance and generating the coordinates randomly from the uniform [0, 1] distribution, so that features were no longer aligned with specific designs or coordinates. This gave an MI of 0.62 (0.03), and a scaled value of 0.84. A second smaller simulation of randomly generated designs and feature coordinates revealed that smaller samples produced higher MI. As the sample size increased, the MI decreased toward zero, the theoretical value for independence. It would therefore be useful for practitioners to evaluate the MI on randomly permuted data to support interpretation of the MI score on the design database.

### 5.6 Rotation.

The more similar the designs in the database, the stronger the signal for making predictions. However, different designs may have been created with a different orientation. Section 4.6 described how the KL measure could be used to rotate one design to minimize the probabilistic differences between them. There is not an analytic solution to Eq. (18) for our non-parametric model but we can minimize the KL divergence between two designs *u* and *v*, *D*_{KL}(*D*_{u}‖*D*_{v}), by finding the angles of rotation that maximize the second term by a simulation design embedded in an optimization routine. As the KDE may be a noisy function, a global optimization routine should be used, although a brute force search is feasible in 2D.

The rotation can be implemented as follows. The design database contains feature instances that are assumed to be representative of the underlying orientation of the designs, and we consider this as a single average design. Each design is then orientated in turn to this average design, excluding the design being rotated from the pool. First, multiple random draws are simulated from the normal kernel, with means equal to the feature positions of the design to be rotated. Then the KL measure is calculated between the samples and the average design. The new design is then rotated and re-sampled until the KL measure is minimized. An illustration of the 2D rotation of a part is given in the Supplemental Material S3 available in the Supplemental Materials on the ASME Digital Collection.
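A brute force 2D version of this search can be sketched as below. It makes several assumptions: rotation is about the centre of the unit square, the kernel is isotropic (so rotating draws from the design is equivalent to drawing from the rotated design, which lets one set of draws be reused for every angle), and only the rotation-dependent term, the mean log density of the pool at the rotated draws, is maximized. The function names and the choice of 360 candidate angles are illustrative.

```python
import math
import random


def rotate(point, angle, centre=(0.5, 0.5)):
    """Rotate a 2D point anticlockwise by `angle` radians about `centre`."""
    x, y = point[0] - centre[0], point[1] - centre[1]
    c, s = math.cos(angle), math.sin(angle)
    return (centre[0] + c * x - s * y, centre[1] + s * x + c * y)


def log_kde(point, centres, sd):
    """Log of an isotropic normal kernel density, floored to avoid log(0)."""
    x, y = point
    total = sum(
        math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sd * sd)) for cx, cy in centres
    )
    return math.log(max(total / (2 * math.pi * sd * sd * len(centres)), 1e-300))


def best_rotation(design, pool, sd=0.05, n_samples=300, n_angles=360, seed=0):
    """Return the angle that maximizes the mean log pool density at the rotated draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        cx, cy = rng.choice(design)
        draws.append((rng.gauss(cx, sd), rng.gauss(cy, sd)))
    best_angle, best_score = 0.0, -math.inf
    for k in range(n_angles):
        angle = 2 * math.pi * k / n_angles
        score = sum(log_kde(rotate(p, angle), pool, sd) for p in draws) / n_samples
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Hypothetical pool, and a design that is the pool turned by -90 degrees
pool = [(0.9, 0.5), (0.5, 0.9), (0.3, 0.5)]
design = [rotate(p, -math.pi / 2) for p in pool]
angle = best_rotation(design, pool)
print(math.degrees(angle))  # close to 90 degrees, up to simulation noise
```

In 3D, or with a finer angular resolution, the exhaustive loop would be replaced by a global optimization routine, as noted above.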

### 5.7 Choice of Standard Deviation.

The choice of standard deviation (bandwidth) of the normal kernel determines how spread out the predicted density is around the training data observations. Figure 8 shows the predicted density for the same training data under different kernel standard deviations. There are two main approaches one could take in determining a suitable value for this parameter. It may be that prediction is the only desirable performance measure, and then through cross-validation exercises, an optimal value can be identified. Alternatively, with more emphasis on the prospective nature of this decision support, to facilitate the determination of designs that are similar to historical designs, this parameter can be used as a controlling lever, whereby small values will result in predictions that are very close to previous designs.

### 5.8 Discussion of Results.

We have described how to implement the proposed process for predicting the type and location of the features that might be added during an engineering design process. We evaluated the method on a dataset of real designs through a cross-validation process. In 90% of the evaluation runs, the feature’s actual location and the prediction (once at least one feature had been selected) were very close (i.e., within 0.5 grid space on average—1% of the normalized range of the part). When more features were added to the design, the accuracy of the predictions improved. This observation can be clearly seen in the ranking of the predicted features (i.e., an ordered list of the most to least likely features to occur at a given location). If four features had been selected (i.e., added to the design), the subsequent features selected were, on average, ranked in the first 25% of the list of suggestions. This increased to the top 10% once eight features had been selected.

This behavior reflects the nature of the commercial product families which formed the dataset. These have frequently repeated sets of features at standardized positions within a design, and so after one choice has been made then subsequent choices can be predicted with a high probability. In other words, portfolios of mechanical designs have strong dependence in data that results in strong predictive performance.

Figure 9 shows two components with a feature prediction “heat map” manually superimposed onto their faces. These hotspots indicate where features (regardless of their type and size) are frequently located in the training data. For each figure, the left-hand side plot presents the complete part, the central plot presents the predictive density from the training pool when no features have been added to the part, and the right-hand side plot the updated density given that the new features have been added to the design.

Figure 9(b) presents an example of how the predictive density changes once a feature has been added to the design. Before a choice has been made, the greatest predictive density is placed on the four corners of the feasible box, the normalized range of the scaled prediction region, indicating that this orientation is most common in the training dataset. Once a feature of specific size and position has been added, the predictive density changes to favor a pattern of six holes away from the corners. Red and blue circles are used in Fig. 9(b) to highlight how the predictive density in these regions changes given some feature addition. Although such heat maps give an intuitive overview, the prediction results can be presented to a user in several different ways. For example, given a location (e.g., the user’s cursor), a list of feature suggestions (ranked in order of their likelihood) could be generated.

The heat maps also illustrate the need for further research into user interfaces that allow the designer to control the choice of training data (used to generate the predictions) and the scaling/mapping of the results onto new designs. In the case study, the feature coordinates were normalized to boundaries determined by the extent of feature locations within the training dataset. For example, in Fig. 9(a), predictions were generated across the unit cube and so there is non-zero density in the corners, highlighted by the black circle in the central panel of Fig. 9(a), whereas if a unit circle had been used to normalize the feature locations, the result would be more appropriate to the shape by omitting the truncated predicted region corners. An obvious artifact of the current approach to normalization is that there will be regions on a face (colored green in Fig. 9) that are beyond the geometric extent of the features used to train the prediction system. Due to the restrictions that we have imposed through this mapping, predictions were not generated outside the range of the normalized region. However, depending on the choice of kernel, one could extrapolate beyond these boundaries for a new design, but as with any extrapolation, this requires stronger assumptions.

This could be mitigated by filtering the suggestions so that only those that are physically or functionally feasible are presented as options to the designer. The development of effective filters would also enable the geometric limits on feature prediction to be determined in a manner most appropriate to the application (e.g., a part bounding box or specific planes). However, work is required on generic scaling functions to support this. For example, the top flange of the design-part provided in Fig. 9(b) is rectangular, and so the predictive density needs to be mapped back to the original part dimensions from the unit cube used to generate predictions.

## 6 Summary and Conclusions

The aim of this research was to “define a computational framework that can support an interactive design process with suggestions of features based on three inputs: a knowledge of existing designs; the state of an emerging design; and a location on the surface of the emerging design.” The authors believe that the system described meets this goal and has established how the feature content of mechanical designs can be amalgamated and transformed into a likelihood function that defines the probability of particular design features occurring at specific locations on a model.

The work has not only demonstrated that the architecture of the proposed system is viable but also established that the computations can be done quickly enough to support a dynamic design process. For example, the prototype system can respond to a given mouse location at interactive speeds (i.e., ms) and consequently could support user interface functionality such as pop-up menus (customized to reflect likely feature types and parameter values) or even ghosting images of possible features onto the cursor location as it moves to particular locations. In this way, the engineer is free to ignore these selections in the same way a user of a predictive text system is able to adopt or dismiss suggestions when composing SMS texts.

The case study illustrated the method using hole features; however, the feature set could be extended to include those with a more complicated geometry. Provided that a feature can be defined geometrically and hence extracted from the CAD design, the prediction method can be applied by considering such features as another type of mark. This would allow for modeling the dependence both between and across feature types.

### 6.1 Limitations.

Like other predictive systems, there are inevitable limitations. Currently, the system can only predict the likelihood of features occurring within the volume defined by the maximum extent of the features extracted from the training dataset. Understanding how these results can be generalized to support predictions across variable volumes, as well as optimal scaling of the normalized prediction region, is an area of further research. Additionally, while the method of data homogenization appears to be viable for product families with very regular structures (e.g., industrial valves or manifold blocks), its behavior with product families with more variable forms is not clear.

However, one of the features of all interactive predictive systems (that makes them viable) is that the user is always free to ignore suggestions that are wrong or out of context. In other words, predictive systems do not have to provide perfect predictions all the time to be useful.

### 6.2 Future Work.

Having established the fundamentals of the theory, the authors intend to broaden the application to other datasets of mechanical component designs. This will allow the investigation of the methodology’s ability to support multiple feature types and more geometrically varied product families, i.e., the scaling of the normalized prediction region. The merits and implications of estimating the normalized prediction region using different kernels which can account for boundary effects will be studied. Considering MPPs beyond simple Euclidean geometry also provides opportunities. The current focus of the project has been on providing decision support to a single engineer, and how such a system will support concurrent designs carried out simultaneously by distributed teams is a topic that requires further investigation.

Although this work has established the theoretical and computational foundations for a predictive system, its utility will ultimately depend on how its user interface behaves. Although beyond the scope of this work, follow-on projects will seek to incorporate the predictive functionality described into a commercial CAD system (via its Application Programming Interface (API)) and so allow a systematic assessment of the impact of predictive CAD on design productivity to be undertaken.

## Acknowledgment

This work was supported by the Engineering and Physical Sciences Research Council, UK [grant number EP/R004226/1]. The dataset of hole features extracted from the valve models is available.^{2} We are grateful for the suggestions from two anonymous referees which improved the paper.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The data and information that support the findings of this article are freely available.^{3} The authors attest that all data for this study are included in the paper.

## Footnotes

See Note ^{2}.

### Appendix A: Moments From KDE

### Appendix B: Mutual Information Distributions

*ω* as follows:

*s*′ random simulations of (*x*, *y*) locations and feature *m* from *f*(*x*, *y*, *m*). As we increase the number of simulations, we can obtain a more accurate estimate of the expectation.

*x*, *y*). For features, direct calculation of the entropy is straightforward.