
Computers & Graphics 99 (2021) 100–113


Special Section on 3DOR 2021

‘CADSketchNet’ - An Annotated Sketch Dataset for 3D CAD Model Retrieval with Deep Neural Networks

Bharadwaj Manda a,∗, Shubham Dhayarkar a,1, Sai Mitheran b,1, V.K. Viekash b,1, Ramanathan Muthuganapathy a

a Indian Institute of Technology Madras, India
b National Institute of Technology Tiruchirappalli, India

∗ Corresponding author. E-mail addresses: [email protected] (B. Manda), [email protected] (S. Dhayarkar), [email protected] (S. Mitheran), [email protected] (V.K. Viekash), [email protected] (R. Muthuganapathy).
1 Shubham Dhayarkar, Sai Mitheran, and V.K. Viekash have contributed equally.

https://doi.org/10.1016/j.cag.2021.07.001
0097-8493/© 2021 Elsevier Ltd. All rights reserved.

Article info

Article history:
Received 2 April 2021
Revised 30 June 2021
Accepted 2 July 2021
Available online 8 July 2021

Keywords:
Retrieval
Search
Dataset
Deep Learning
CAD
Sketch

Abstract

Ongoing advancements in the fields of 3D modelling and digital archiving have led to an outburst in the amount of data stored digitally. Consequently, several retrieval systems have been developed depending on the type of data stored in these databases. However, unlike text data or images, performing a search for 3D models is non-trivial. Among 3D models, retrieving 3D Engineering/CAD models or mechanical components is even more challenging due to the presence of holes, volumetric features, sharp edges etc., which make CAD a domain unto itself. The research work presented in this paper aims at developing a dataset suitable for building a retrieval system for 3D CAD models based on deep learning. 3D CAD models from the available CAD databases are collected, and a dataset of computer-generated sketch data, termed ‘CADSketchNet’, has been prepared. Additionally, hand-drawn sketches of the components are also added to CADSketchNet. Using the sketch images from this dataset, the paper also aims at evaluating the performance of various retrieval systems or search engines for 3D CAD models that accept a sketch image as the input query. Many experimental models are constructed and tested on CADSketchNet. These experiments, along with the model architectures and the choice of similarity metrics, are reported along with the search results.

© 2021 Elsevier Ltd. All rights reserved.

1. Introduction

The search or retrieval of Engineering (CAD) models is crucial for a task such as design reuse [1]. Designers spend a significant amount of time searching for the right information and reuse a large percentage of existing designs for new product development [2]. [3] indicates that a large percentage (75% or greater) of design reuses existing knowledge for new product development. This calls for the search and classification of 3D Engineering models [4]. With the wide applicability of 3D data and the increased capabilities of modelling, digital archiving, and visualization tools, the problem of searching or retrieving CAD models becomes a predominant one.

The research work presented in this paper aims at developing a well-annotated sketch dataset of 3D Engineering CAD models that can aid in the development of deep learning-based search engines for 3D CAD models, using a sketch image as the input query. Using a sketch-based query for the search offers many advantages. 3D shapes, unlike text documents, are not easily retrieved using textual annotations ([5]) since it is difficult to characterize what human beings see and perceive using a mere text annotation. [6,7] show that content-based 3D shape retrieval methods (those that use the visual/shape properties of the 3D models) are more effective. It is also shown in [6,7] that using traditional search methods for multimedia data will not yield the desired results. [8] utilizes the idea of using the feature descriptors/vectors of the 3D model for the search query. Among the available query options, a sketch-based query is shown to be very intuitive and convenient for the user [9–11], since it is easier for the user to learn and use such a system than to use a 3D model itself as the query, which requires technical expertise and skill [12].

The Princeton Shape Benchmark (PSB) [13] was one of the earlier 3D shape databases. Consequently, large-scale datasets such as ShapeNet [14] came into being. Due to such data availability, many machine learning-based techniques, which require a good amount of data to train the models, have been developed ([15,16]). [17] provided the first benchmark dataset of sketches based on the 3D models in PSB. As a result of this, [18] and [19] have introduced large-scale benchmarks for sketch data for 3D shapes, including many approaches for sketch-based retrieval.

However, these datasets and methodologies only aim at generic 3D shape data or graphical models and do not contain Engineering CAD models. The presence of features such as holes (genus > 0), blind holes (genus = 0) and other machining features in an Engineering/CAD model (see Fig. 1a) calls for special treatment as opposed to a 3D graphical model, where such features and holes are usually absent (see Fig. 1b). Also, sharp edges are usually found in a CAD model, unlike a graphical model that has smooth curvature throughout.

Fig. 1. Distinction between a Graphical Model and a CAD Model.

In the case of Engineering models, the people involved are required to have rich domain knowledge and experience. CAD models are typically obtained via the design process, unlike 3D shape data which are acquired via 3D scans. Since most design data are proprietary, they are not available in the public domain (e.g. [20]). Moreover, the few datasets that are available for usage either lack proper annotations [21] or contain too few models [22–24], which is not very conducive for employing deep learning methods. Although the work presented in [25] performs a sketch-based retrieval of 3D CAD models, the dataset is proprietary. The Engineering Shape Benchmark (ESB) [26] is a prominent dataset for CAD models, having 801 models. More recently, datasets such as the Mechanical Components Benchmark (MCB) [27] and CADNET [28] have been proposed. Nonetheless, the data is in the form of 3D models, and no sketch information is present. As far as sketch data for Engineering models is concerned, there are no datasets available to the best of our knowledge.

Our motivation for addressing the problem of retrieval of Engineering/CAD models, therefore, comes from the following:

1. Most CAD datasets typically contain only a few models and hence cannot be utilized by the latest technological advances such as deep neural networks.
2. Datasets having a larger number of CAD models are either proprietary (not publicly available) [20] or lack classification information [21].
3. There is no existing database for sketch data of 3D CAD models, since obtaining hand-drawn sketches for CAD models is difficult.
4. The recent advances in deep learning have not been made use of, to the best of our knowledge.

The key contributions of the paper are:

1. Using the available 3D CAD models from [26] and [27], a large-scale dataset of computer-generated sketch data called ‘CADSketchNet’ is created.
2. Additional hand-drawn sketches of CAD models are also incorporated into the dataset using the models available from [26].
3. To benchmark the developed dataset, we analyzed the performance of various learning-based approaches for the sketch-based retrieval of 3D CAD models.
4. The performance of the various experimental retrieval systems on CADSketchNet is compared, and the search results are reported.

This paper makes a useful contribution to the research community involving ‘mechanical components’ and allows researchers to develop new algorithms for the same. The paper is organized as follows: Section 2 discusses the related works corresponding to 3D CAD models, in addition to the literature on images and generic 3D shapes. The dataset preparation is explained in Section 3, including the challenges involved in creating hand-drawn sketch data for 3D CAD models, the need for computer-generated sketches and the process of generating such sketches. The experiments done in order to benchmark the developed ‘CADSketchNet’ dataset are detailed in Section 4. Section 5 provides the implementation details. The results, limitations and possible future work are elaborated in Section 6, followed by a conclusion (Section 7).

2. Related works

Many works in recent times have focused on 3D graphical models and images. We focus more on the approaches that have been proposed for the search and classification of 3D mechanical components, which are very few. However, a few of the recent approaches used in other domains are also mentioned for the sake of completeness.

2.1. Images

Deep learning became ubiquitous for image-related tasks since the application of Convolutional Neural Networks (CNN) to the ImageNet Challenge [29]. For the sketch-based retrieval of image data, sketch datasets such as [30] and [31] have been introduced. The work in [32] extracts edge maps of the matching and non-matching images and uses these maps for training. [33], [34] popularize the zero-shot learning framework. [35] attempts an on-the-fly sketch-based image retrieval using a reinforcement learning methodology.

2.2. 3D models of common objects

[14] introduced the ShapeNet dataset and also performed classification and retrieval tasks on it. Many learning-based approaches have been proposed for the tasks of classification and retrieval using the ModelNet dataset. The leader-board can be found at [36], with the popular methods being [37–39]. However, none of them uses a sketch query. A few tracks of the Shape Retrieval Contest (SHREC) involved a sketch-based retrieval challenge for 3D shapes. SHREC’12 [40] involved a sketch-based retrieval task and introduced a benchmark dataset. Building upon the dataset by [17], SHREC’13 [18] included a Large Scale Sketch-Based 3D Shape Retrieval track with a larger dataset. [12] compares different methods for sketch-based 3D shape retrieval. SHREC’14 [19] expanded upon this further to an Extended Large Scale Sketch-Based 3D Shape Retrieval track. SHREC’17 [41] involved a sketch-based retrieval task of 3D indoor scenes.

2.3. 3D CAD models of engineering shapes

The SHREC track [42] presents a retrieval challenge using the Engineering Shape Benchmark (ESB) dataset [26]. [43] uses the idea from content-based image retrieval to the domain of 3D CAD models. [44] performs a visual similarity-based retrieval using 2D


engineering drawings. This method converts 2D drawings to a shape histogram and then applies the idea of spherical harmonics to obtain a rotation-invariant shape descriptor. The Minkowski distance was used to measure the similarity between feature representations.

[45] uses the data from ShapeLab [46] and the ESB to develop a sketch-based 3D part retrieval system. This paper uses the idea of classifier combinations to aid in the retrieval process. Engineering models are classified functionally and not visually, as opposed to image or 3D graphical data. Taking this into account, the extracted shape descriptors (Zernike Moments and Fourier Transforms) are sent to a Support Vector Machine (SVM) classifier. A weighted combination of the classifier outputs is then used to estimate the class or category of the input query, which is then compared against the classes of the database. This is one of the earlier works that use learning-based methods for building a sketch-based retrieval system for CAD models. However, the sketch data itself is not available.

More recently, a sketch-based semantic retrieval of 3D CAD models is presented by [25]. The CAD models and their parametric features (used in 3D modelling software) are taken. The pre-processed sketch query is first vectorized and then passed onto topology-based rules. An integrated similarity measurement strategy is used to compute the similarity between the query sketch and the CAD models’ database. This research work uses a dataset of 2148 CAD models and six corresponding views of each model. This dataset is proprietary and is not available. Also, the method presented here uses the classical rule-based approach over the latest advances in learning-based approaches.

2.4. Other related works

[47] presents a detailed study on the state-of-the-art methods and the future of sketch-based modelling and interaction. However, the discussion mainly focuses on sketch interpretation and on the development and usage of interactive sketches. There is very limited discussion on 3D engineering shape data.

[48] introduces the OpenSketch dataset, which contains annotated sketches corresponding to various product designs. A detailed study is done in order to understand the stroke time and pressure. A taxonomy of lines is also provided, and the strokes are labelled. However, the dataset contains only 107 sketches across 12 categories, which is not sufficient for developing learning-based models.

The SPARE3D dataset [49] aims at understanding the spatial reasoning behind line drawings using deep neural networks. While the dataset uses 10,369 3D CAD models, it only contains line drawings of 3D objects from 8 different isometric views and does not contain any hand-drawn sketches. ProSketch3D [50] consists of 1500 sketches of 3D models across 500 object categories taken from ShapeNet. All sketches correspond to generic 3D shapes and not to 3D mechanical components.

3. Dataset creation

As discussed in Section 2, a database of 3D models and their corresponding sketches is not available for the domain of CAD models. [51] introduces a sketch dataset for CAD models. However, it is not useful for a traditional search problem since the sketches are based on the design workflow, i.e. based on ‘how’ a model is designed, rather than the model shape and geometry. [25] uses only a proprietary dataset. Therefore, we attempt to build a new sketch dataset, termed ‘CADSketchNet’, using the 3D models from existing CAD databases.

3.1. Challenges in creating a dataset of hand-drawn sketches

Creating a hand-drawn sketch for a 3D mechanical component is much more challenging than sketching a generic 3D shape. This is because:

• It is difficult to capture the detailed information present in a CAD model, such as the presence of holes and volumetric features, in a single sketch.
• Multiple viewing directions can be chosen to draw the sketch.
• The sketches need to be drawn by users with domain knowledge and experience. Gathering a set of users to contribute to building such a dataset is both time-consuming and expensive.
• Once the hand-drawn sketches are obtained, they need to be verified and validated for correctness and closeness to the input CAD model.
• Different users have different drawing styles, and consistency needs to be maintained across the hand-drawn sketches.

Due to these reasons, attempting to create a dataset of hand-drawn sketches for a large number of 3D CAD models is a tedious task.

3.2. Hand-drawn sketch data generation

The ESB dataset [26] is a publicly available CAD database that is also well annotated. In the ESB, there are 801 3D CAD models across 42 classes (excluding the models in the ‘Misc’ category). Since the ESB is a reasonably sized dataset, we attempt to obtain hand-drawn sketches for all 801 3D CAD models of the ESB.

3.2.1. Gathering users for obtaining hand-drawn sketches

Around 50 users with experience in CAD and engineering drawing were selected in order to provide a hand-drawn sketch for each 3D CAD model in the ESB. The users are mostly engineering students pursuing undergraduate and postgraduate courses in design, with a few industry professionals. Because the goal is to create a dataset that can be used to develop learning-based solutions, users were asked to draw the sketches in their natural style so that the dataset might capture more variability. Furthermore, users were encouraged not to be overly precise and correct, because the learning algorithms that could use this dataset can capture input variations while staying robust to noise.

3.2.2. Obtaining the hand-drawn sketches

Each user was then shown the 3D object (digital) and asked to draw a digital sketch on a hand-held tablet device. Since a 3D object can be viewed from many directions, the best viewing direction for the sketch (i.e. the one that covers the entire geometry of the object) is determined by the user, based upon domain knowledge and experience. Cases of potential ambiguity with respect to the viewing direction are resolved by a majority vote among the users.

As we only attempt to draw a single sketch for each 3D CAD model in the ESB dataset, the number of sketches obtained and the category information acquired are the same as those of the ESB. The reader can refer to the paper on the ESB [26] for more information on the category information and the number of models in each class.

3.2.3. Processing the hand-drawn sketches

Initial cleaning of these hand-drawn sketches was done by sketch pre-processing using the Autodesk SketchBook software. Subsequently, the sketches were validated for their correctness and closeness (resemblance to the 3D object) by domain experts from academia and the industry. These images and the class labels (the same label as the 3D model from the ESB) are stored as a database.


Hence, the number of sketches is only of the order of a few hundred. Nevertheless, this dataset is stored and will be made available, since it is challenging to obtain a real dataset of hand-drawn sketches. This dataset could also be used by algorithms that do not need large-scale training data.

Fig. 2. Sample hand-drawn sketches from the developed ‘CADSketchNet’.

3.2.4. Analyzing the hand-drawn sketches

During the course of obtaining user-drawn sketches, it was observed that simple object classes like Clips, Bolt-Like Parts, Nuts and Discs were very easy to draw, and took very little time to obtain. Users encountered significant difficulty when they were required to sketch object categories with complex features, such as 90-degree elbows, Motor Bodies, Rectangular Housing, and so on. For most other object categories, the users were able to draw the sketches with a manageable level of difficulty.

There is also a need to assess the quality of the sketches in terms of the correctness of each user’s choice of viewing direction. The concept of the Light Field Descriptor (LFD) is proposed by [52], in which a 3D model is placed inside a regular dodecahedron and images are obtained by using each vertex as a viewing direction. Using this idea, 20 view images are obtained for each 3D object. The obtained sketch is compared with each of the 20 view images. The view image with the highest obtained similarity score is noted, and the viewing direction is cross-checked with the user’s choice of viewing direction. In a majority of the cases, the gathered users were able to accurately determine the ideal viewing direction for a model. In the few circumstances where this was not the case, the user was requested to redraw the sketch. The quality of the produced hand-drawn sketch dataset is thus ensured. Figure 2 shows a few sample sketches.

3.3. Creating Computer-generated Sketch Data

The Mechanical Components Benchmark (MCB) [27] contains 58,696 3D CAD models across 68 classes. Hence, for our study, we utilize the 3D CAD models from the MCB to prepare a dataset of sketch images. Deep learning methodologies are data-driven and call for a large amount of data. Since it is tough to obtain hand-drawn sketches for such a large-sized dataset of complex 3D mechanical components, we attempt to create computer-generated sketch images corresponding to each CAD model in the MCB.

3.3.1. Converting a 3D model to a 2D image

A representative 2D image of every 3D model in the MCB needs to be obtained as a first step. A Python script is used to save the image of the 3D model from a particular viewing direction. Applying the idea of LFD, we obtain 20 images for every 3D object. It is then necessary to identify one representative image for each object. For this task, a group of 70 volunteers with knowledge of CAD were identified. The 3D CAD objects were split into batches. For each batch, the above procedure was applied, and one among the resultant output images was chosen as the representative image for the 3D object. Repeating this process for every batch, one image corresponding to every 3D CAD model in the MCB was generated. The overall procedure is summarized in Algorithm 1.

Algorithm 1 Method to obtain a 2D image for a 3D model

Input: Database of 3D CAD models
procedure 3DOBJ_TO_IMG
    Split the 3D models in the database into n batches
    i ← 1
    while i <= n do
        for every model in the batch do
            Apply LFD to obtain 20 images
            Users identify 1 among the 20 images
        i ← i + 1
Output: 2D representative image for every 3D CAD model

3.3.2. Creating a computer-generated sketch from the image

There is ample literature related to generating computer sketches. Edge detection of images is a fundamental approach to obtaining the object boundaries. Hence, we first begin our experiments to generate computer sketches for the 3D CAD models in the MCB dataset with the popular edge detection methods.

We first experiment with the Canny edge detection method [53], which essentially tries to identify the object boundaries present in the image using first-derivative methods to identify local image features. The method is applied to the 2D images of the CAD models obtained from Algorithm 1. In order to enhance the obtained edges, we combine the Canny edge detection process with a second approach that uses the idea of image blurring, coupled with boundary shading. By using Gaussian blurring, the internal object features are suppressed. Thus, a weighted combination of the Canny edge detection and the idea of Gaussian blurring is proposed for generating the computer sketches.

The reason for using these two techniques is as follows. Since the original images are obtained from CAD mesh models, they contain several mesh lines. To generate sketches that can capture the overall shape and geometry of the object, we need to use methods capable of extracting the significant edges without paying too much attention to minute details of the object. Usage of the Canny filter detects prominent edges effectively by thresholding, even in a noisy environment. The minute noisy details are easily removed due to the presence of a Gaussian Filter (GF), and the required signal (prominent pixels) can be enhanced using the Canny edge detector, which uses non-maximum suppression against the noise, resulting in a well-defined output. Additionally, to enhance the output, we operate only in gray-scale rather than RGB. The procedure is summarized in Algorithm 2.

The weight assigned to the Gaussian blurring is significantly lower in order to avoid too much suppression of minute details and thus a loss of vital information. Hence, the output of the weighted scheme closely resembles the output of the Canny edge detection scheme. The outputs of both these methods, along with the output of the weighted scheme, are shown in Figure 3 for a sample input.
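For concreteness, a minimal OpenCV sketch of this weighted Canny and Gaussian-blur scheme (elaborated as Algorithm 2 below) is given here; the kernel size, standard deviation, thresholds and blend weights are illustrative placeholders rather than the values used to build CADSketchNet.

```python
import cv2

def image_to_sketch(img_path, k=21, sigma=5, canny_lo=50, canny_hi=150,
                    w_blur=0.2, w_canny=0.8):
    """Approximate the weighted Canny + Gaussian-blur scheme of Algorithm 2.
    All parameter values here are illustrative, not the published settings."""
    rgb = cv2.imread(img_path)                          # I: input image (BGR)
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)        # G: grayscale
    inv = 255 - gray                                    # IG: inverted grayscale
    blur = cv2.GaussianBlur(inv, (k, k), sigma)         # B: Gaussian-blurred IG
    inv_blur = 255 - blur                               # IB: inverted blur
    shading = cv2.divide(gray, inv_blur, scale=256.0)   # element-wise division
    _, o1 = cv2.threshold(shading, 240, 255, cv2.THRESH_BINARY)  # O1: threshold
    o2 = cv2.Canny(gray, canny_lo, canny_hi)            # O2: Canny edges of G
    sketch = cv2.addWeighted(o1, w_blur, o2, w_canny, 0)  # S: weighted average
    return sketch

# Example usage (file name assumed):
# cv2.imwrite("sketch.png", image_to_sketch("cad_view.png"))
```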


Fig. 3. (a) 2D image obtained from Algorithm 1 (b) Result of Canny edge detection (c) Result of Gaussian blurring (d) Result of weighted scheme of (b)&(c)

Algorithm 2 Method to generate a computer sketch from an image

Input: Images of 3D CAD models from Algorithm 1
1: procedure IMG_TO_SKETCH
2:     I ← Read input image in RGB color space
3:     G ← Convert image from RGB color space to grayscale
4:     IG ← Invert color values of all pixels in G
5:     B ← Convolve IG with a non-uniform GF: kernel size k & SD σ
6:     IB ← Invert the blurred image B
7:     Element-wise division of G & IB with scale = 256.0
8:     O1 ← Binary threshold the obtained image
9:     O2 ← Canny edge detection of G
10:    S ← Weighted average over O1 and O2
Output: Computer-generated sketches of 3D CAD models

In addition to the weighted Canny edge detection method mentioned above, other popular edge detectors such as the Scharr, Prewitt, Sobel and Roberts Cross operators are also experimented with. A similar weighted scheme is applied to each of these methods. Other state-of-the-art sketch generation methods, such as NeuralContours [54], PhotoSketch [55], and the Context-aware tracing strategy (CATS) [56], are also experimented with.

3.3.3. Comparing hand-drawn and computer-generated sketches

Creating hand-drawn sketches for a 3D CAD model is extremely difficult, as established in Section 3.1, and since such sketch data is not available for the MCB, we cannot directly compare the computer-generated and hand-drawn sketches. However, for the ESB dataset, we have obtained hand-drawn sketches (see Section 3.2). Therefore, computer sketches are generated for the 801 models in the ESB, and these are compared with the corresponding hand-drawn sketches.

Many sketch generation methods were experimented with in Section 3.3.2. To find out which among these methods results in sketches that most closely resemble the hand-drawn ones, we attempt to compare the computer-generated sketch and the hand-drawn sketch of each 3D CAD model in the ESB. Various state-of-the-art similarity metrics are used, and the average similarity score across all models is obtained. A detailed comparison is reported in Table 1. From the table, it is clear that the proposed weighted Canny approach performs much better than the plain Canny edge detection. The MSE values with and without the non-maximal suppression (NMS) stage of the plain Canny method are 1010.96 and 1577.43 respectively, while that of the weighted Canny is 209.41. The values indicate that Canny without NMS performs poorly, and the weighted Canny approach performs the best.

It can also be seen that for most similarity metrics, NeuralContours [54] generates a sketch closest to the hand-drawn sketch. However, the time taken to generate one sketch with the other methods is much higher than with the proposed sketch generation method in Section 3.3.2. The Neural Contours method takes around 800 seconds to generate a single sketch image on an NVIDIA 1080Ti GPU, owing to the complex neural network pipeline. This is not suitable for generating sketches for a large number of 3D models. On the other hand, while the performance of the proposed weighted-Canny method is close enough to the Neural Contours method in most of these cases, the time taken by it to generate one sketch from a 3D model is just one second (including converting a 3D model to an image followed by generating a sketch), on the same hardware setup. This aids very much in generating sketches for a large dataset of 3D models such as the MCB. Hence, the proposed method of weighted Canny edge detection is chosen to efficiently generate all the computer sketches of the MCB dataset.

It is important to note that the goal of creating this dataset is to aid in the development of deep learning-based CAD model search engines. Since the end-users of the search engine are humans, and sketches drawn by an average human being are bound to have errors, the dataset needs to contain sketches that are not perfect. Only then will the learning-based methodologies that make use of this dataset become robust to input noise and errors. Figure 4 shows, for two sample cases, the image of the 3D CAD model, the hand-drawn sketch and the computer-generated sketch. It can be observed that while the computer-generated sketch contains a lot of detail and bears a close resemblance to the input, the sketch does not look realistic. On the other hand, the hand-drawn sketches provide a realistic database of query sketches that can be used to train a robust search engine. Clearly, a hand-drawn sketch dataset is more important and valuable as compared to computer-generated sketch data. Nevertheless, in the absence of a standard large-scale benchmark dataset of hand-drawn sketches, the computer-generated sketch data generated for the MCB dataset using Algorithm 2 is the best option available.

3.3.4. Analysis of the proposed sketch-generation method

To further understand the complexity involved in generating the sketch data, the time taken by the proposed technique is computed for each class of the MCB. Since the MCB dataset is not class-balanced, i.e. the data is unevenly distributed across classes, we compare the average time for generating a single sketch of a particular class. It is observed that complex object categories such as Helical Geared Motors, Castor, Turbine etc. take a significantly higher time compared to the other categories. Simple object classes such as Convex Washer, Cylindrical Pin, Setscrew, Washer Bolt etc. take negligible time.

As discussed earlier with respect to the hand-drawn sketches, we only attempt to generate a single sketch corresponding to every 3D CAD model in the MCB dataset and do not change anything else. Hence, the number of sketches obtained is the same as the number of CAD models in the MCB, i.e. 58,696 sketches across the 68 classes of the MCB. For detailed information regarding the categories and the number of models in each class, the reader is directed to the MCB paper [27].


Table 1
Similarity results obtained by using various approaches for comparing the hand-drawn and the computer-generated sketches for various sketch-
generation methods on the 801 CAD models of the ESB dataset. ↑ indicates that greater value for the metric indicates higher similarity, while ↓
indicates the opposite. The plain-canny results reported here are with non-maximal suppression. The similarity measures used are PSNR - Peak
signal-to-noise ratio; MS-SSIM - Multi Scale Structural Similarity Index [57]; IE - Information Entropy; VIF - Visual Information Fidelity [58]; MSE
- Mean Squared Error; UQI - Universal image Quality Index [59];

Sketch-generation method PSNR ↑ MS-SSIM ↑ IE ↑ VIF ↑ MSE ↓ UQI ↑ Conversion time (Per image in sec) ↓

plain-canny 18.0834 0.5718 1.5248 0.0034 1010.9600 0.9874 0.0021


weighted-scharr 21.3913 0.6327 1.3649 0.0031 472.0136 0.9948 0.0581
weighted-prewitt 21.7143 0.6412 1.3904 0.0031 438.1824 0.9952 0.0547
weighted-roberts 20.4433 0.6169 1.3498 0.0031 587.1513 0.9935 0.0517
weighted-sobel 21.6066 0.6388 1.3555 0.0031 449.1815 0.9951 0.0498
neural-contours [54] 25.4318 0.9319 1.3425 0.5292 186.1659 0.9977 828.50
photosketch [55] 12.5434 0.4978 3.7558 0.0055 3620.2508 0.9367 9.0180
CATS [56] 16.2040 0.5428 1.5796 0.0035 1558.3710 0.9788 1.7810
weighted-canny (ours) 24.9429 0.8208 1.6737 0.0034 209.4152 0.9977 0.0190
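As an illustration of how such scores can be computed, the sketch below compares a hand-drawn and a computer-generated sketch with scikit-image; it assumes two grayscale images of equal size with hypothetical file names, and is not the authors' evaluation script (the multi-scale SSIM [57], VIF, IE and UQI variants are not covered here).

```python
import cv2
from skimage.metrics import (peak_signal_noise_ratio,
                             structural_similarity,
                             mean_squared_error)

# Hypothetical file names for one ESB model's two sketches.
hand = cv2.imread("hand_drawn.png", cv2.IMREAD_GRAYSCALE)
auto = cv2.imread("computer_generated.png", cv2.IMREAD_GRAYSCALE)
auto = cv2.resize(auto, (hand.shape[1], hand.shape[0]))  # match image sizes

print("PSNR:", peak_signal_noise_ratio(hand, auto))
print("SSIM:", structural_similarity(hand, auto))
print("MSE :", mean_squared_error(hand, auto))
```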

Fig. 4. (a), (d) - Sample Images extracted from two random CAD models in ESB; (b), (e) Computer generated sketch data; (c),(f) - hand-drawn sketch data; Although the
computer-generated sketch has a lot of detail and resembles the input closely, the hand-drawn sketch database provides realistic query images that aid in training robust
retrieval systems

Fig. 5. Sample images from the developed ‘CADSketchNet’ Dataset-A: Computer-generated sketches.

3.4. Summary and ‘CADSketchNet’ Details

The dataset ‘CADSketchNet’ contains two subsets.

• Dataset-A contains (1) one representative image for each 3D CAD model in the MCB dataset (obtained using Algorithm 1) and (2) one computer-generated sketch for each representative image in the MCB dataset (obtained using Algorithm 2). This results in 58,696 computer-generated sketches across 68 categories. Some sample sketches generated by the proposed method are shown in Figure 5.
• Dataset-B contains 801 hand-drawn sketches, one for each 3D CAD model in the ESB dataset across 42 categories (as described in Section 3.2). Since the 3D CAD model data is obtained from the ESB, the same category information applies to Dataset-B as well. Some sample hand-drawn sketches are shown in Figure 2.

Dataset-A has all the images split into an 80-20 ratio for training and testing, respectively. This split is as per the MCB [27]. Dataset-B contains no train-test split since the size of the dataset is not as large as Dataset-A. Nonetheless, users who intend to use this data can customize the train-test ratio as needed.

4. Experiments

In this section, we analyze the behaviour of a few learning algorithms for 3D CAD model retrieval on Dataset-A and Dataset-B of CADSketchNet.
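Since Dataset-B ships without a split, a custom stratified 80-20 split can be produced, for example, with scikit-learn. The folder layout below (one folder per class) is an assumption for illustration, not the published layout.

```python
import glob
import os
from sklearn.model_selection import train_test_split

# Hypothetical layout: cadsketchnet_b/<class_name>/<sketch>.png
paths = glob.glob("cadsketchnet_b/*/*.png")
labels = [os.path.basename(os.path.dirname(p)) for p in paths]

# Stratified 80-20 split, mirroring the ratio used for Dataset-A / MCB.
train_paths, test_paths, train_labels, test_labels = train_test_split(
    paths, labels, test_size=0.2, stratify=labels, random_state=0)
```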


Table 2
Results obtained by using various view-based approaches when trained using CADSketchNet - Dataset-A.

Methodology Training Time Precision Recall Retrieval Time mAP Top k-Accuracy %

MVCNN 7h 30m 0.932 0.889 3.73e-05 0.894 94.03


GVCNN 7h 32m 0.959 0.904 3.79e-05 0.853 90.11
RotationNet 10h 14m 0.947 0.868 3.78e-05 0.872 92.18
MVCNN-SA 7h 04m 0.941 0.903 3.69e-05 0.912 94.56

Fig. 6. A generic pipeline for a sketch-based 3D CAD model search engine. Input query - a sketch image drawn by the user. Pipeline I - input sketch pre-processing followed
by feature extraction. Pipeline II - extracts features of all the CAD models in the database and stores them as a bag of features. The output feature vector from I is compared
against the bag of features from II. Using similarity metrics, the object(s) that is(are) most similar to the input query is(are) retrieved.

4.1. Experiments on Dataset-A

Methods proposed in the literature mainly use point cloud representations ([15,16]) and voxel-grid representations ([60]). These network architectures take in point cloud inputs or graph inputs etc., and not images. Since the sketch data is available in the form of images, it is not possible to experiment with these architectures. However, a few papers use view-based representations ([37], [61]). Since we are dealing mainly with image representations of 3D CAD models, only the view-based methods can be experimented with on Dataset-A of ‘CADSketchNet’.

The performances of four view-based learning architectures are analysed: MVCNN [37], GVCNN [61], RotationNet [62], and MVCNN-SA [63]. For training each model, we use the code and the default settings for hyper-parameters, as mentioned in the respective papers. A short description of these papers is given here for the sake of completeness and for a better understanding of the techniques experimented with:

• MVCNN - Uses two camera setups (12 views and 80 views) to render 2D images from a 3D model. These views are passed to a first CNN for extracting relevant features. The obtained features are then pooled and fed into a second CNN to obtain a compact shape descriptor.
• GVCNN - The 2D views of a 3D model are generated, followed by a grouping of these views resulting in different clusters with associated weights. GoogLeNet is used as the base architecture.
• RotationNet - Uses only a partial set (≥1) of the full multi-view images of an object as input. For each input image, the CNN also outputs the best viewpoint along with the predicted class, since the network treats the view-images as latent variables that are optimized in the training process.
• MVCNN-SA - Uses an approach similar to MVCNN, but attempts to assign relative importance to the input views by using an additional self-attention network.

A train-test split ratio of 80%-20% is used on ‘CADSketchNet’, which is similar to that of the MCB. The results of each of these methods are summarized in Section 6.

4.2. Experiments on Dataset-B

Dataset-B in itself is not a large-scale dataset, and hence, not all deep network architectures can be trained on it. However, using the LFD images (1 3D CAD model = 20 images), the amount of available training data also increases. Hence, in addition to the view-based techniques mentioned above, we also come up with a few other rule-based and learning-based approaches and analyze their performance. The overall pipeline for performing a sketch-based search for 3D CAD models using deep learning can be broadly described as follows (a minimal code sketch of this pipeline is given after the list):

1. Preparing a dataset of 3D CAD models and their corresponding sketches suitable for training and testing a deep learning model (discussed in Section 3).
2. Extracting feature representations from the CAD model as well as the query sketch.
3. Developing the model architecture that can efficiently be trained using the extracted representation(s) as input.
4. Checking the similarity of the queried sketch and the 3D CAD models either directly or via their feature representations.
5. Retrieving the top-ranked result(s) based on the metric(s) of similarity used.
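A minimal sketch of steps 2-5, assuming a hypothetical feature extractor and a precomputed bag of CAD-model features, is shown below; the systems described in Section 4.2.1 substitute HOG or CNN features and their own similarity measures for these placeholders.

```python
import numpy as np

def retrieve(query_sketch, model_features, model_ids, extract_features, k=10):
    """Return the ids of the k CAD models most similar to the query sketch.

    query_sketch     : 2D numpy array (the input sketch image)
    model_features   : (N, D) array, one feature vector per CAD model (Pipeline II)
    model_ids        : list of N model identifiers
    extract_features : callable mapping an image to a (D,) vector (Pipeline I)
    """
    q = extract_features(query_sketch)                   # step 2: query features
    dists = np.linalg.norm(model_features - q, axis=1)   # step 4: L2 similarity check
    top_k = np.argsort(dists)[:k]                         # step 5: top-ranked results
    return [model_ids[i] for i in top_k]
```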


Table 3
Results obtained by using various view-based approaches when trained using CADSketchNet - Dataset-B.

Methodology Training Time Precision Recall Retrieval Time mAP Top k-Accuracy %

MVCNN 5h 22m 0.949 0.902 3.78e-05 0.912 95.15


GVCNN 5h 25m 0.972 0.894 3.85e-05 0.886 92.90
RotationNet 8h 3m 0.968 0.871 3.97e-05 0.909 96.66
MVCNN-SA 4h 58m 0.9506 0.943 3.63e-05 0.947 96.23

Table 4
Results obtained by using various models when trained on CADSketchNet - Dataset-B. The Siamese Network architecture out-
performs other methods.

Methodology Training Time Precision Recall Retrieval Time mAP Top k-Accuracy %

HOG-HOG – 0.666 0.031 3.72e-05 0.383 40.00


HOG-AE 8h 20m 0.666 0.054 1.81e-05 0.526 51.25
HOG-StackedAE 4h 30m 0.666 0.020 2.04e-05 0.490 37.13
HOG-3DCNN 10h 55m 0.666 0.048 5.41e-05 0.458 41.21
Siamese Network (CNN-CNN) 1h 10m 0.970 0.784 2.88e-05 0.977 95.11

Table 5
Results obtained by using various CNN architectures for the sketch-based retrieval of 3D CAD models when trained on the
hand-drawn sketches of CADSketchNet (Dataset-B).

CNN Architecture Precision Recall Retrieval Time mAP Top k-Accuracy %

LeNet5 [68] 0.9222 0.6854 2.00e-05 0.6667 60.93


AlexNet [69] 0.8757 0.7746 2.22e-05 0.7901 72.11
VGG16 [70] 0.9112 0.6990 2.24e-05 0.6860 78.81
VGG16_BN [70] 0.8989 0.7921 2.12e-05 0.7498 74.32
VGG19 [70] 0.8889 0.8100 2.20e-05 0.9004 88.80
VGG19_BN [70] 0.9498 0.9100 2.60e-05 0.8905 83.42
Inceptionv3 [71] 0.9668 0.9310 2.90e-05 0.9457 90.31
DenseNet121 [72] 0.9579 0.9330 2.60e-05 0.9001 89.65
DenseNet161 [72] 0.9660 0.9320 2.85e-05 0.9220 91.90
DenseNet169 [72] 0.9410 0.9430 2.54e-05 0.9671 96.54
DenseNet201 [72] 0.9312 0.9256 2.78e-05 0.9256 95.73
Xception [73] 0.9790 0.9555 2.44e-05 0.9780 97.20
ResNet18 [74] 0.9849 0.9610 3.56e-05 0.9660 97.50
ResNet34 [74] 0.9516 0.9810 2.09e-05 0.9851 94.98
ResNet50 [74] 0.9257 0.9066 2.67e-05 0.9234 92.30
ResNet101 [74] 0.9445 0.9255 2.93e-05 0.9222 90.34
ResNet152 [74] 0.9753 0.8923 3.34e-05 0.8231 96.80
ResNeXt50 [75] 0.9407 0.9667 2.89e-05 0.9775 92.09
ResNeXt101 [75] 0.9041 0.9433 2.85e-05 0.9432 88.91

An overview of a generic sketch-based search engine for CAD models is shown in Fig. 6. In the following sub-sections, each step of the pipeline used is explained in greater detail.

4.2.1. Feature Extraction and Model Architecture Details

This section discusses steps 2 & 3 of the pipeline, namely (1) extracting the feature representations of the query sketch and the 3D CAD models, and (2) using the extracted representations to build an appropriate network architecture. These two steps go hand-in-hand, since the model architecture depends upon the dimensionality of the extracted features. If the extracted representation is a feature vector (1D), a deep neural network (DNN) or a variational auto-encoder (VAE) can be used. If the extracted representations are images (2D), then a convolutional neural network (CNN), a convolutional auto-encoder (CAE), or a Siamese Network (SN) are some possible options. In some other cases, the features are extracted by the neural network itself. The various methods experimented with by us are described in this section.

Model-1 : HOG-HOG The Histogram of Oriented Gradients (HOG) is a widely used feature descriptor in computer vision and image processing. It differs from other feature extraction methods by extracting the edges’ gradient and orientation rather than extracting the edges themselves. The input image is broken down into smaller localized regions, and for each region, the gradients and orientations are computed. Using these, a histogram is computed for each region. Our Model-1 uses the HOG in both Pipelines I and II (see Fig. 6).

Pipeline I: The inputs to this pipeline are the sketch images from the dataset. These sketches are passed to the HOG algorithm, which generates the feature vectors for each image separately. The following configuration is used for the HOG algorithm after due experimentation: No. of pixels per cell: (8,8); No. of cells per block: (1,1); Orientations: 8; Block normalization: L2; Feature vector size: 1024×1.

Pipeline II: Using the idea of LFD, 20 images (256×256) are obtained for each 3D CAD model, resulting in 801×20 images. These images are then forwarded to the HOG block, which has a similar configuration to that mentioned above, and the bag of features is obtained.
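A minimal scikit-image sketch of the HOG extraction in Pipeline I with the stated configuration follows; the file name and the resize step are assumptions, and the resulting vector length depends on the input resolution.

```python
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize

sketch = imread("query_sketch.png", as_gray=True)  # hypothetical file name
sketch = resize(sketch, (256, 256))                # LFD view images are 256x256

# Configuration from the text: 8x8 pixels per cell, 1x1 cells per block,
# 8 orientations, L2 block normalization.
features = hog(sketch, orientations=8, pixels_per_cell=(8, 8),
               cells_per_block=(1, 1), block_norm="L2")
print(features.shape)  # length depends on the chosen input resolution
```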


Fig. 7. Top 10 search results of the developed Siamese network model for some sample input queries. Models retrieved in red boxes are from a different category.

Table 6
The top k-accuracy values of deep-learning architectures that are pre-trained on ModelNet and tested on CADSketchNet vs. when trained on the developed CADSketchNet.

Method ModelNet40 Dataset-A Dataset-B

MVCNN 75.83% 94.03% 95.15%
GVCNN 82.46% 90.11% 92.90%
RotationNet 86.84% 92.18% 96.66%
MVCNN-SA 85.94% 94.56% 96.23%

Model-2 : HOG-AE Model-2 uses the HOG pipeline as defined in Model-1 for Pipeline I and an auto-encoder for Pipeline II. The bag of features is obtained by training the auto-encoder (AE) on the LFD images and then extracting the encoded representation from the latent space of the AE. After various experiments, the following architecture for the AE is obtained. Encoder details: 8 conv layers with (3×3) filters and (1,1) stride; Batch Normalization (BN) is applied after every two conv layers; 2 dense layers. Decoder details: 2 dense and 8 deconv layers; an up-sampling layer of size (2,2) is applied after every two deconv layers. Activation function: LReLU with a negative slope of 0.01. This AE is trained for 30 epochs using the Adam optimization algorithm. Learning rate: 0.0001; Loss: Mean Squared Error (MSE).

Model-3 : HOG-StackedAE Model-3 uses the HOG pipeline as defined in Model-1 for Pipeline I and a stacked auto-encoder (SAE) for Pipeline II. Pipeline II is similar to the one defined in Model-2. Instead of passing all 801×20 images as separate inputs, the 20 images of each 3D CAD model are stacked and sent as a single input. Changes from the pipeline in Model-2: Number of epochs: 50; Learning rate: 3e-5.

Model-4 : HOG-3DCNN Model-4 uses the HOG pipeline as defined in Model-1 for Pipeline I and a 3D convolutional neural network (3D-CNN) for Pipeline II. For Pipeline II, each 3D CAD model is passed through a 3D-CNN. The extracted representations from the final dense layer are collected together as the bag of features. The architecture used for the 3D-CNN is: 18 3D conv layers with kernel size (1,1,1) and stride (1,1,1); max-pooling with kernel-size=2, stride=2, along with BN applied after every 2 conv layers; an average pool layer with kernel-size=2, stride=2; two dense layers with a dropout of 0.5. This network is trained for 120 epochs with a learning rate of 0.0001; LReLU activation; Loss: MSE; Optimizer: Adam.

Model-5 : CNN-CNN / Siamese Network Model-5 uses a convolutional neural network (CNN) for both pipelines. This architecture is also known as a Siamese Network, which implements the same network architecture and weights for both pipelines. The network is trained for 10 epochs; Learning rate: 0.0001; Optimizer: Adam; Batch size: 2; Loss: Siamese loss function as described in [64]; Activation function: LeakyReLU with a negative slope of 0.01.

To ensure that a sufficient proportion of similar and dissimilar pairs are generated, an approach similar to that of [64] is used. For each training sketch, a random number of view pairs (k_p) in the same category and k_n view samples from other categories (dissimilar pairs) are chosen. In the current experiment, the values k_p = 2 and k_n = 20 are used. This random pairing is done for each training epoch. For increasing the number of training samples, data augmentation for the sketch set is also done.

5. Implementation Details

5.1. Coding framework and system configuration

For implementing our neural network models, we use Python3 with PyTorch, while Python3 and sklearn were used to implement the HOG algorithm. The OpenCV library is used for all implementations. All the implementations are carried out on a system running the Ubuntu 18.04 operating system. The system has an Intel Core i7-8700K CPU with 64GB RAM and an NVIDIA RTX 2080Ti GPU with 12GB RAM.

5.2. Hyper-parameter Tuning, Loss function & Optimization

Training a neural network is an arduous process because of the many decisions involved beforehand, such as the choice of performance metrics, hyper-parameters, loss function, etc. Our choices are mainly based on heuristics ([65–67]) and are backed by experimental verification. The Weights & Biases (wandb) library is also used to assist in hyper-parameter tuning.

6. Results and discussion

The results of all the model experiments (see Section 4) are discussed here. Since there are no existing methods on the CADSketchNet dataset, we use these experimental results to compare the performance of the best retrieval system. The results of search or retrieval are subjective and cannot be precisely quantified. Nevertheless, to evaluate the retrieval system’s performance, we use the standard metrics popularly used in the literature. The ‘top k-accuracy’ denotes how many of the k retrieved classes match the ground truth class. For instance, if 6 out of the top 20 retrieved results match the ground truth class, the accuracy is 30%. For all the reported results, we use k=10. We also calculate the precision and recall values for the retrieval results. The mean Average Precision (mAP), which is the area under the P-R curve, is also computed.

6.1. Results of experiments on dataset-A

The results of the four view-based methods experimented with in Section 4.1 are summarized in Table 2, along with the time taken for retrieval. Both the MVCNN and the MVCNN-SA methods yield similar performance when tested on Dataset-A, with MVCNN-SA performing slightly better in terms of less training time, less time taken for retrieval and the top-k accuracy. Although RotationNet takes the longest time for training, it performs better than GVCNN. This could be because of the following reasons. (1) GVCNN assigns relative weights for the input view images and groups the views according to the assigned weights. It might be that the viewing direction of the sketch image does not match with that of the group with the highest weight, thus slightly affecting the performance. (2) RotationNet takes into account the alignment of the input objects. Since object orientations in the MCB dataset, and thereby CADSketchNet-A, are aligned, the performance of the RotationNet model is enhanced.
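As an illustration of the ‘top k-accuracy’ measure defined at the start of this section (not the authors' evaluation code), a small sketch is:

```python
def top_k_accuracy(retrieved_classes, ground_truth_class, k=10):
    """Percentage of the top-k retrieved results whose class matches the query's class."""
    top_k = retrieved_classes[:k]
    return 100.0 * sum(c == ground_truth_class for c in top_k) / len(top_k)

# Example from the text: 6 of the top 20 retrievals share the query's class -> 30%.
print(top_k_accuracy(["Bolt"] * 6 + ["Nut"] * 14, "Bolt", k=20))  # 30.0
```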


Table 7
Results obtained by the Siamese Network Architecture (Model-5) for the sketch-based retrieval of 3D CAD models when trained on the computer-
generated sketches of CADSketchNet (Dataset-A): Part 1 of 2

S No Class No.of Models Precision Recall Retrieval Time mAP Top k-Accuracy

1 Articulations, eyelets and other articulated joints 1632 0.992 0.815 4.10E-05 0.986 97.58
2 Bearing accessories 107 0.990 0.803 5.60e-05 0.985 98.11
3 Bushes 764 0.988 0.820 4.20e-05 0.985 97.99
4 Cap nuts 225 0.983 0.809 4.10e-05 0.989 97.84
5 Castle nuts 226 0.979 0.820 4.80e-05 0.988 97.82
6 Castor 99 0.983 0.821 4.60e-05 0.987 98.18
7 Chain drives 100 0.974 0.814 4.00e-05 0.988 98.17
8 Clamps 155 0.979 0.807 4.60e-05 0.989 98.21
9 Collars 52 0.986 0.835 5.10e-05 0.984 97.56
10 Conventional rivets 3806 0.983 0.819 3.70e-05 0.986 97.82
11 Convex washer 91 0.981 0.818 3.40e-05 0.987 98.03
12 Cylindrical pins 1895 0.982 0.814 3.30e-05 0.985 97.69
13 Elbow fitting 383 0.992 0.823 3.10e-05 0.987 97.41
14 Eye screws 1131 0.980 0.831 5.20e-05 0.984 98.16
15 Fan 213 0.995 0.817 3.00e-05 0.989 97.15
16 Flanged block bearing 404 0.988 0.815 2.90e-05 0.988 98.03
17 Flanged plain bearings 110 0.985 0.816 4.30e-05 0.988 98.50
18 Flange nut 53 0.991 0.808 5.60e-05 0.988 97.81
19 Grooved pins 2245 0.980 0.812 3.50e-05 0.987 97.85
20 Helical geared motors 732 0.989 0.825 4.70e-05 0.986 97.83
21 Hexagonal nuts 1039 0.991 0.815 3.20e-05 0.988 97.50
22 Hinge 54 0.976 0.815 5.60e-05 0.986 97.96
23 Hook 119 0.985 0.828 5.40e-05 0.990 98.17
24 Impeller 145 0.997 0.840 4.10e-05 0.989 97.06
25 Keys and keyways, splines 4936 0.993 0.818 6.10e-05 0.985 97.71
26 Knob 644 0.988 0.809 4.20e-05 0.984 97.72
27 Lever 1032 0.972 0.816 3.90e-05 0.987 97.66
28 Locating pins 55 0.992 0.820 4.70e-05 0.989 98.14
29 Locknuts 254 0.988 0.805 5.30e-05 0.991 97.75
30 Lockwashers 434 0.979 0.817 4.60e-05 0.986 98.04
31 Nozzle 154 0.988 0.820 4.40e-05 0.988 97.77
32 Plain guidings 49 0.980 0.811 4.70e-05 0.987 98.25
33 Plates, circulate plates 365 0.985 0.813 2.20e-05 0.985 97.67
34 Plugs 169 0.983 0.815 4.60e-05 0.983 98.03
35 Pulleys 121 0.976 0.830 5.10e-05 0.988 97.99
36 Radial contact ball bearings 1199 0.981 0.824 2.30e-05 0.988 97.83

6.2. Results of experiments on dataset-B

The results of the four view-based methods experimented with in Section 4.2 are summarized in Table 3. Also, for the experimental models described in Section 4.2.1, the outputs from Pipelines I and II are compared using a ‘similarity check’ block. For all the models (except Model-5), the Mean Squared Error (MSE) function is used to calculate the similarity between the extracted features. For Model-5, a custom Siamese loss function described in [64] is used to measure the similarity. These results and the time taken for retrieval are reported in Table 4.

6.2.1. Discussion

Model-1 uses a simple HOG-HOG pipeline and is unable to provide the user with relevant search results. Other models use deep learning techniques and provide better results than the naive approach used in Model-1 (except Model-3). Even the time taken for retrieval is relatively high, considering the inexpensive computational nature of an algorithmic approach as opposed to a data-driven one.

Model-2 and Model-3 use a similar approach to retrieval, with the difference being that Model-2 uses the view images of the CAD models separately, while Model-3 uses the view images together (stacked). The time taken to train Model-3 (see Table 4) is expectedly lower, since Model-2 has to process 20 times more image data. While the results from Model-2 appear quite satisfactory (in the sense that the top-k accuracy is slightly greater than 50%), Model-3 severely under-performs. This might be because the network uses a stacked set of view images, and in an attempt to capture much information, the network overfits the data and thus under-performs. Thus Model-2, which treats each view image separately, performs much better.

Model-4 uses a 3D-CNN, which is computationally very intensive. This is evident from Table 4, where Model-4 takes the highest time for training and searching. This is not conducive since, in real-time scenarios, the search results need to be delivered to the user quickly. Also, the model’s performance is not satisfactory, since capturing the 3D CAD models directly leads to unnecessary computations because most voxels in the input data would be empty (3D CAD models are typically sparse). This adversely affects the retrieval performance.

As is evident from Table 4, using a Siamese network architecture (a CNN in both pipelines) for the search engine yields the best results among all methods that have been experimented with, for all evaluation metrics. Some sample search results are shown in Figure 7. Also, the model takes much less training time compared with the other models. The search time depends a lot on the network architecture, the number of parameters to be trained and so forth.
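A minimal PyTorch sketch of a Siamese comparison of the kind used in Model-5 is given below; the branch architecture, the contrastive-style loss, and all sizes are illustrative assumptions and do not reproduce the exact network or the loss of [64].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseBranch(nn.Module):
    """Shared CNN branch that embeds a 256x256 grayscale sketch or view image.
    Layer sizes are illustrative, not the exact architecture of Model-5."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.01),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.01),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.LeakyReLU(0.01),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, embed_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def contrastive_loss(z1, z2, label, margin=1.0):
    """label = 1 for a similar (same-class) pair, 0 for a dissimilar pair."""
    d = F.pairwise_distance(z1, z2)
    return (label * d.pow(2) + (1 - label) * F.relu(margin - d).pow(2)).mean()

# One training step on a batch of (sketch, view, label) pairs:
branch = SiameseBranch()
sketch = torch.randn(8, 1, 256, 256)   # batch of query sketches
view = torch.randn(8, 1, 256, 256)     # batch of LFD view images
label = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(branch(sketch), branch(view), label)
loss.backward()
```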


Table 8
Results obtained by the Siamese Network Architecture (Model-5) for the sketch-based retrieval of 3D CAD models when trained on the
computer-generated sketches of CADSketchNet (Dataset-A): Part 2 of 2

S No Class No.of Models Precision Recall Retrieval Time mAP Top k-Accuracy

37 Right angular gearings 60 0.984 0.829 5.10e-05 0.988 97.64


38 Right spur gears 430 0.991 0.815 3.80e-05 0.986 97.70
39 Rivet nut 51 0.991 0.831 4.70e-05 0.984 97.63
40 Roll pins 1597 0.990 0.814 4.20e-05 0.988 98.16
41 Screws and bolts with countersunk head 2452 0.990 0.836 5.20e-05 0.982 97.46
42 Screws and bolts with cylindrical head 3656 0.976 0.818 3.70e-05 0.988 98.06
43 Screws and bolts with hexagonal head 7058 0.983 0.819 6.10e-05 0.989 97.71
44 Setscrew 1334 0.986 0.839 3.80e-05 0.983 97.93
45 Slotted nuts 78 0.984 0.811 5.80e-05 0.989 98.03
46 Snap rings 609 0.978 0.839 4.10e-05 0.986 97.86
47 Socket 858 0.983 0.816 3.80e-05 0.992 97.78
48 Spacers 113 0.990 0.827 3.60e-05 0.987 98.03
49 Split pins 472 1.000 0.830 5.10e-05 0.985 97.53
50 Springs 328 0.986 0.819 5.30e-05 0.991 97.75
51 Spring washers 55 0.991 0.819 2.60e-05 0.984 97.52
52 Square 72 0.991 0.818 5.10e-05 0.987 97.68
53 Square nuts 53 0.987 0.821 4.40e-05 0.991 97.95
54 Standard fitting 764 0.990 0.815 4.80e-05 0.987 98.23
55 Studs 4089 0.983 0.822 5.90e-05 0.988 97.55
56 Switch 173 0.984 0.816 5.70e-05 0.987 97.51
57 Taper pins 1795 0.981 0.821 2.90e-05 0.987 98.15
58 Tapping screws 2182 0.978 0.815 4.70e-05 0.986 97.70
59 Threaded rods 1022 0.979 0.811 3.10e-05 0.987 97.99
60 Thrust washers 2333 0.992 0.835 3.20e-05 0.985 98.14
61 T-nut 101 0.980 0.823 4.40e-05 0.985 97.81
62 Toothed 47 0.989 0.815 5.30e-05 0.988 97.85
63 T-shape fitting 338 0.975 0.812 1.60e-05 0.987 98.03
64 Turbine 85 0.979 0.823 5.30e-05 0.986 97.63
65 Valve 94 0.981 0.815 3.30e-05 0.989 97.67
66 Washer bolt 912 0.982 0.837 4.80e-05 0.986 97.57
67 Wheel 243 0.979 0.826 4.20e-05 0.989 97.47
68 Wingnuts 50 0.992 0.816 5.40e-05 0.989 97.81
Overall 58696 0.985 0.820 4.34e-05 0.987 97.83

so forth. Also, the other methods use MSE for similarity measure- 6.3. Comparison with deep-learning approaches used for 3D
ment, while Model-5 uses a custom Siamese loss function which graphical models
takes longer to compute. Moreover, search time is only a secondary
measure of evaluating model performance. It is more important to The results mentioned in the preceding sections are for models
obtain a good performance as opposed to obtaining inaccurate re- trained on the created CADSketchNet dataset, which focuses on 3D
sults quickly. CAD models of mechanical components. The Section 1, Introduc-
Since Model-5 gives the best performance, various deep learn- tion, includes a description of how these CAD models differ from
ing architecture pipelines are used for the Siamese network and ordinary 3D shapes. In this section, we attempt to validate the as-
are analyzed for performance. These results are reported in Table 5. sertion by employing deep-learning models designed for 3D graph-
It can be seen that using a ResNet18 backbone yields the best re- ical models. We examine the performance of the same view-based
trieval accuracy, and ResNet34 backbone results in the best mAP strategies that were mentioned in Section 4.1.
value. We use the same pipeline summarized in Figure 6 for this
In addition, the class-wise results of only Model-5 when trained on Dataset-A are reported in Tables 7 and 8. It is observed that the Siamese model performs very well on the computer-generated sketches of Dataset-A. The top k-accuracy value for every class in the MCB dataset lies in the range 97.06% to 98.50%. MCB is a large dataset and has sufficient examples in each category; hence, the higher scores are justified.
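For reference, the per-class quantities reported in Tables 7, 8 and 9 (precision, recall, mAP and top k-accuracy) follow the standard retrieval definitions. The short sketch below illustrates them for a single query; the cut-off k and the averaging conventions are assumptions rather than the paper's exact evaluation protocol.

```python
# Illustrative computation of standard retrieval metrics for one query.
# The value of k and the averaging conventions are assumptions, not the
# authors' exact protocol.
import numpy as np


def evaluate_query(ranked_labels, query_label, k=10):
    """ranked_labels: database labels sorted by ascending embedding distance."""
    ranked_labels = np.asarray(ranked_labels)
    relevant = ranked_labels == query_label
    n_relevant = relevant.sum()

    precision_at_k = relevant[:k].mean()
    recall_at_k = relevant[:k].sum() / max(n_relevant, 1)
    top_k_hit = float(relevant[:k].any())        # per-query "top k-accuracy"

    # Average precision over the full ranked list.
    hits = np.cumsum(relevant)
    precisions = hits / np.arange(1, len(ranked_labels) + 1)
    ap = (precisions * relevant).sum() / max(n_relevant, 1)
    return precision_at_k, recall_at_k, top_k_hit, ap


# Example: a toy ranked list for one sketch query of class "gear".
p, r, hit, ap = evaluate_query(
    ["gear", "gear", "nut", "gear", "bolt"], "gear", k=3)
print(p, r, hit, ap)   # approx. 0.67 0.67 1.0 0.92
```

Averaging the per-query values over all queries of a class gives the class-wise rows of the tables; averaging over all classes gives the overall row.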
Table 9 reports the class-wise results of the Siamese model when trained on Dataset-B, i.e., the hand-drawn sketches, which has only 801 training samples. The results obtained differ significantly from those found for Dataset-A. The class 'Long Machine Elements', which comprises only 15 sketches, has the lowest obtained accuracy of 86.67 percent. Considering the scarcity of training data, this value indicates a good performance. Some categories, such as 'BackDoors' and 'Clips', have only single-digit numbers of training examples. Some of these classes obtain 100% accuracy, but this is only because there is an insufficient number of testing examples. For the majority of the other categories, where a significant number of examples is available for training, the Siamese model performs quite well.

6.3. Comparison with deep-learning approaches used for 3D graphical models

The results mentioned in the preceding sections are for models trained on the created CADSketchNet dataset, which focuses on 3D CAD models of mechanical components. Section 1 (Introduction) describes how these CAD models differ from ordinary 3D shapes. In this section, we attempt to validate that assertion by employing deep-learning models designed for 3D graphical models. We examine the performance of the same view-based strategies that were mentioned in Section 4.1.

We use the same pipeline summarized in Figure 6 for this purpose. Pipeline I receives no changes, whereas Pipeline II employs the architecture that was pre-trained on ModelNet40 (a 3D graphical models dataset). The performance of these approaches is then evaluated on the created CADSketchNet. Table 6 summarizes the results. Comparing them with the methods trained on 3D CAD model data (Tables 2 & 3), it can be inferred that the methodologies established for 3D graphical models do not translate well to 3D CAD models of engineering shapes. Network architectures trained on a dataset specific to CAD models perform significantly better. Therefore, instead of attempting to generalise approaches aimed at graphical models to engineering shapes, there is a need for developing dedicated datasets and methodologies that explicitly focus on 3D CAD models.
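As a rough, hypothetical illustration of what the Pipeline II style of reuse involves, the snippet below loads an assumed checkpoint of a ResNet18 classifier pre-trained on ModelNet40 and turns it into a frozen feature extractor whose embeddings can then be evaluated for retrieval on CADSketchNet. The checkpoint file name, backbone choice and feature dimension are assumptions, not the exact architecture used in the paper.

```python
# Hypothetical illustration of reusing a ModelNet40-pretrained view-based
# classifier as a fixed feature extractor for sketch-based retrieval.
# The checkpoint path and backbone choice are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(num_classes=40)                         # 40 ModelNet40 classes
state = torch.load("modelnet40_resnet18.pth", map_location="cpu")  # hypothetical file
backbone.load_state_dict(state)

backbone.fc = nn.Identity()          # drop the classifier; keep 512-d features
backbone.eval()
for p in backbone.parameters():
    p.requires_grad_(False)          # frozen: used purely as a feature extractor

with torch.no_grad():
    feats = backbone(torch.randn(4, 3, 224, 224))   # embeddings for 4 query sketches
print(feats.shape)   # torch.Size([4, 512])
```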

Table 9
Results obtained by the Siamese Network Architecture (Model-5) for the sketch-based retrieval of 3D CAD models when trained on the hand-drawn sketches of CADSketchNet (Dataset-B).

S No  Class  No. of Models  Precision  Recall  Retrieval Time  mAP  Top k-Accuracy (%)
1 90 degree elbows 41 0.964 0.853 2.21e-05 0.960 95.12
2 BackDoors 7 0.927 0.571 1.92e-05 0.957 100.00
3 Bearing Blocks 7 0.984 0.714 2.78e-05 0.967 100.00
4 Bearing Like Parts 20 0.991 0.900 2.83e-05 0.964 90.00
5 Bolt-like Parts 53 0.975 0.894 2.80e-05 0.966 92.45
6 Bracket-like Parts 18 0.972 0.778 2.68e-05 0.978 94.44
7 Clips 4 0.987 0.500 2.18e-05 0.987 100.00
8 Contact Switches 8 0.971 0.750 2.47e-05 0.971 87.50
9 Container-like Parts 10 0.981 0.700 2.76e-05 0.992 100.00
10 Contoured Surfaces 5 0.962 0.600 2.33e-05 0.986 100.00
11 Curved Housings 9 0.972 0.778 2.18e-04 0.972 100.00
12 Cylindrical Parts 43 0.980 0.930 2.88e-05 0.974 93.02
13 Discs 51 0.970 0.824 3.03e-05 0.977 94.12
14 Flange-like Parts 14 0.979 0.786 2.12e-05 0.978 92.86
15 Gear-like Parts 36 0.976 0.833 2.89e-05 0.985 97.22
16 Handles 18 0.963 0.889 1.92e-05 0.974 100.00
17 Intersecting Pipes 9 0.962 0.667 2.78e-05 0.956 100.00
18 L-Blocks 7 0.951 0.714 2.38e-05 0.979 100.00
19 Long Machine Elements 15 0.974 0.800 2.91e-05 0.978 86.67
20 Long Pins 58 0.969 0.927 2.63e-05 0.976 94.83
21 Machined Blocks 9 0.965 0.778 2.48e-05 0.966 100.00
22 Machined Plates 49 0.984 0.898 2.23e-05 0.986 91.84
23 Motor Bodies 19 0.983 0.842 1.97e-05 0.987 94.74
24 Non-90 degree elbows 8 0.973 0.875 2.61e-05 0.975 87.50
25 Nuts 19 0.971 0.737 1.99e-05 0.972 95.11
26 Oil Pans 8 0.970 0.750 2.68e-05 0.963 89.47
27 Posts 11 0.972 0.728 2.76e-05 0.984 90.91
28 Prismatic Stock 36 0.976 0.861 2.88e-05 0.977 94.44
29 Pulley-like Parts 12 0.974 0.833 2.98e-05 0.982 91.67
30 Rectangular Housings 7 0.968 0.571 2.77e-05 0.983 100.00
31 Rocker Arms 10 0.980 0.700 1.62e-05 0.991 100.00
32 Round Change At End 21 0.953 0.857 1.91e-05 0.967 95.24
33 Simple Pipes 16 0.956 0.875 2.21e-05 0.959 93.75
34 Slender Links 13 0.985 0.769 2.24e-05 0.987 100.00
35 Slender Thin Plates 12 0.971 0.750 2.13e-05 0.979 95.11
36 Small Machined Blocks 12 0.962 0.833 2.42e-05 0.985 100.00
37 Spoked Wheels 15 0.959 0.867 2.65e-05 0.978 93.33
38 T-shaped parts 15 0.954 0.800 2.48e-05 0.986 86.67
39 Thick Plates 23 0.965 0.826 1.99e-05 0.971 91.30
40 Thick Slotted plates 15 0.969 0.733 1.71e-05 0.987 93.33
41 Thin Plates 12 0.971 0.833 2.19e-05 0.995 100.00
42 U-shaped parts 25 0.972 0.800 1.82e-05 0.992 92.00
Overall 801 0.970 0.784 2.88e-05 0.977 95.11

6.4. Limitations and possible future work

The scope of this work is confined to 3D Engineering CAD mesh models. The current method does not retrieve other types of data, such as images or 3D point sets. It is worthwhile to investigate the possibility of creating a unified dataset with multiple input formats and developing a retrieval system that can search for multiple data formats simultaneously. Also, only sketch query inputs are handled by the proposed model, and not text queries or 3D model queries.

Some computer-generated sketches have unintended artifacts caused by image processing algorithms, and a few sketches miss out on some features (Figure 8). While additional processing or other, more complex sketch generation techniques might potentially be utilised for such sketches, this could considerably increase the time required to gather large-scale data.
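The sketch below is a small, hypothetical illustration of how such artifacts can arise in an edge-detection based generation step; it is not the paper's actual sketch-generation pipeline. With aggressive Canny thresholds, faint or thin edges of a rendered CAD view simply disappear from the resulting sketch image. The input filename is assumed.

```python
# Hypothetical illustration of a common failure mode of edge-detection based
# sketch generation: strict thresholds drop thin or faint features.
# This is NOT the paper's actual generation pipeline; the filename is assumed.
import cv2

view = cv2.imread("rendered_view.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
blurred = cv2.GaussianBlur(view, (5, 5), 0)

loose_sketch = cv2.Canny(blurred, 30, 90)      # keeps more (possibly noisy) edges
strict_sketch = cv2.Canny(blurred, 100, 200)   # cleaner, but thin features may vanish

cv2.imwrite("sketch_loose.png", 255 - loose_sketch)    # invert: dark lines on white
cv2.imwrite("sketch_strict.png", 255 - strict_sketch)
```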
As far as the hand-drawn sketches are concerned, since the users are primarily from a design background, the obtained sketches are in the same context. In future, gathering sketches from the perspectives of other area experts, such as assembly or machining, could be considered.

Another possible future work would be for people to contribute hand-drawn sketches once the dataset is made open, thereby increasing its size. Recent 3D CAD model datasets such as [28] can also be utilised to generate more sketch images, further enlarging the dataset. The current work could also be extended to CAD assembly model retrieval [76].
Fig. 8. Illustrating some sample cases of improperly generated computer-sketch data by the proposed method.
7. Conclusion

A sketch dataset of computer-generated query images, called 'CADSketchNet', has been built using the available 3D CAD models from the MCB dataset and the ESB dataset. These images, along with each 3D CAD model's representative images, are stored in a database. Additionally, hand-drawn sketches corresponding to CAD models from the ESB dataset are also created and included in 'CADSketchNet'. This dataset will be made open-source and could contribute to the development of image-based search engines for 3D mechanical component CAD models, using the latest advances in deep learning. The performance of standard view-based methods proposed in the literature is analyzed on CADSketchNet. Additionally, the results of a few other experiments using popular deep learning architectures are also reported. The possibilities of extending this research work to other similar problems have also been discussed.
Declaration of Competing Interest

The authors of this manuscript titled "'CADSketchNet' - An Annotated Sketch dataset for 3D CAD Model Retrieval with Deep Neural Networks" agree to the submission of this manuscript to the Computers and Graphics Journal as part of the 3DOR 2021 Conference. No author is affiliated with any financial organization or has received funding from any source. This paper is not currently being considered for publication elsewhere.

CRediT authorship contribution statement

Bharadwaj Manda: Conceptualization, Methodology, Software, Data curation, Writing – original draft, Visualization, Project administration. Shubham Dhayarkar: Methodology, Software, Validation, Data curation. Sai Mitheran: Methodology, Software, Validation, Data curation. V.K. Viekash: Methodology, Software, Validation, Data curation. Ramanathan Muthuganapathy: Conceptualization, Resources, Writing – review & editing, Supervision.

Acknowledgments

Thanks are due to the teams of the ESB and the MCB datasets for making their data publicly available. Thanks are also due to many users who have contributed to our CADSketchNet dataset.

References

[1] Bai J, Gao S, Tang W, Liu Y, Guo S. Design reuse oriented partial retrieval of cad models. Comput-Aided Des 2010;42(12):1069–84. doi:10.1016/j.cad.2010.07.002.
[2] Gunn TG. The mechanization of design and manufacturing. Sci Am 1982;247(3):114–31.
[3] Ullman DG. The mechanical design process, vol 2. McGraw-Hill New York; 2010. https://www.davidullman.com/mechanical-design-process-6ed
[4] Iyer N, Jayanti S, Lou K, Kalyanaraman Y, Ramani K. Three-dimensional shape searching: state-of-the-art review and future trends. Comput-Aided Des 2005;37(5):509–30. doi:10.1016/j.cad.2004.07.002. Geometric Modeling and Processing 2004.
[5] Funkhouser T, Min P, Kazhdan M, Chen J, Halderman A, Dobkin D, Jacobs D. A search engine for 3d models. ACM Trans Graph 2003;22(1):83–105. doi:10.1145/588272.588279.
[6] Tangelder JWH, Veltkamp RC. A survey of content based 3d shape retrieval methods. In: Proceedings Shape Modeling Applications, 2004; 2004. p. 145–56. doi:10.1109/SMI.2004.1314502.
[7] Bustos B, Keim D, Saupe D, Schreck T. Content-based 3d object retrieval. IEEE Comput Graph Appl 2007;27(4):22–7. doi:10.1109/MCG.2007.80.
[8] Suzuki MT. A web-based retrieval system for 3d polygonal models. In: Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference (Cat. No. 01TH8569), vol. 4; 2001. p. 2271–6. doi:10.1109/NAFIPS.2001.944425.
[9] Aono M, Iwabuchi H. 3d shape retrieval from a 2d image as query. In: Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference; 2012. p. 1–10.
[10] Shin H, Igarashi T. Magic canvas: Interactive design of a 3-d scene prototype from freehand sketches. In: Proceedings of Graphics Interface 2007. New York, NY, USA: Association for Computing Machinery; 2007. p. 63–70. ISBN 9781568813370. doi:10.1145/1268517.1268530.
[11] Lee J, Funkhouser T. Sketch-based search and composition of 3d models. In: Proceedings of the Fifth Eurographics Conference on Sketch-Based Interfaces and Modeling. Goslar, DEU: Eurographics Association; 2008. p. 97–104. ISBN 9783905674071.
[12] Li B, Lu Y, Godil A, Schreck T, Bustos B, Ferreira A, Furuya T, Fonseca MJ, Johan H, Matsuda T, Ohbuchi R, Pascoal PB, Saavedra JM. A comparison of methods for sketch-based 3d shape retrieval. Comput Vis Image Understand 2014a;119:57–80. doi:10.1016/j.cviu.2013.11.008.
[13] Shilane P, Min P, Kazhdan M, Funkhouser T. The princeton shape benchmark. In: Proceedings Shape Modeling Applications, 2004; 2004. p. 167–78. doi:10.1109/SMI.2004.1314504.
[14] Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J. 3D shapenets: A deep representation for volumetric shapes. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2015. p. 1912–20. doi:10.1109/CVPR.2015.7298801.
[15] Qi CR, Su H, Mo K, Guibas LJ. Pointnet: Deep learning on point sets for 3d classification and segmentation. In: CVPR. IEEE Computer Society; 2017a. p. 77–85.
[16] Qi CR, Yi L, Su H, Guibas LJ. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In: NIPS.
[17] Eitz M, Richter R, Boubekeur T, Hildebrand K, Alexa M. Sketch-based shape retrieval. ACM Trans Graph 2012a;31:31:1–31:10.
[18] Li B, Lu Y, Godil A, Schreck T, Aono M, Johan H, Saavedra JM, Tashiro S. SHREC'13 Track: Large Scale Sketch-Based 3D Shape Retrieval. In: Castellani U, Schreck T, Biasotti S, Pratikakis I, Godil A, Veltkamp R, editors. Eurographics Workshop on 3D Object Retrieval. The Eurographics Association; 2013. ISBN 978-3-905674-44-6.
[19] Li B, Lu Y, Li C, Godil A, Schreck T, Aono M, Burtscher M, Fu H, Furuya T, Johan H, Liu J, Ohbuchi R, Tatsuma A, Zou C. Extended Large Scale Sketch-Based 3D Shape Retrieval. In: Bustos B, Tabia H, Vandeborre J-P, Veltkamp R, editors. Eurographics Workshop on 3D Object Retrieval. The Eurographics Association. ISBN 978-3-905674-58-3.
[20] Qin F-w, Li L-y, Gao S-m, Yang X-l, Chen X. A deep learning approach to the classification of 3D CAD models. Journal of Zhejiang University SCIENCE C 2014;15(2):91–106. doi:10.1631/jzus.C1300185.
[21] Koch S, Matveev A, Jiang Z, Williams F, Artemov A, Burnaev E, Alexa M, Zorin D, Panozzo D. Abc: A big cad model dataset for geometric deep learning. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2019.
[22] Ip CY, Regli WC, Sieger L, Shokoufandeh A. Automated learning of model classifications. In: Proceedings of the Eighth ACM Symposium on Solid Modeling and Applications. New York, NY, USA: ACM; 2003. p. 322–7. ISBN 1-58113-706-0. doi:10.1145/781606.781659.
[23] Wu MC, Jen SR. A neural network approach to the classification of 3D prismatic parts. Int J Adv Manuf Technol 1996;11(5):325–35. doi:10.1007/BF01845691.
[24] Bespalov D, Ip CY, Regli WC, Shaffer J. Benchmarking CAD search techniques. In: Proceedings of the 2005 ACM Symposium on Solid and Physical Modeling. New York, NY, USA: ACM; 2005. p. 275–86. ISBN 1-59593-015-9. doi:10.1145/1060244.1060275.
[25] Qin F, Gao S, Yang X, Bai J, Zhao Q-h. A sketch-based semantic retrieval approach for 3d cad models. Appl Math-A J Chinese Univer 2017;32:27–52.
[26] Jayanti S, Kalyanaraman Y, Iyer N, Ramani K. Developing an engineering shape benchmark for CAD models. Comput-Aided Des 2006;38(9):939–53. doi:10.1016/j.cad.2006.06.007. Shape Similarity Detection and Search for CAD/CAE Applications.
[27] Kim S, Chi H-g, Hu X, Huang Q, Ramani K. A large-scale annotated mechanical components benchmark for classification and retrieval tasks with deep neural networks. In: Proceedings of 16th European Conference on Computer Vision (ECCV); 2020.
[28] Manda B, Bhaskare P, Muthuganapathy R. A convolutional neural network approach to the classification of engineering models. IEEE Access 2021. doi:10.1109/ACCESS.2021.3055826.
[29] Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L. Imagenet large scale visual recognition challenge. Int J Comput Vis 2015;115(3):211–52. doi:10.1007/s11263-015-0816-y.
[30] Sangkloy P, Burnell N, Ham C, Hays J. The sketchy database: Learning to retrieve badly drawn bunnies. ACM Trans Graph 2016;35(4). doi:10.1145/2897824.2925954.
[31] Eitz M, Hays J, Alexa M. How do humans sketch objects? ACM Trans Graph 2012b;31(4). doi:10.1145/2185520.2185540.
[32] Radenovic F, Tolias G, Chum O. Deep shape matching. In: Proceedings of the European Conference on Computer Vision (ECCV); 2018.
[33] Dey S, Riba P, Dutta A, Lladós J, Song Y. Doodle to search: Practical zero-shot sketch-based image retrieval. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2019. p. 2174–83. doi:10.1109/CVPR.2019.00228.
[34] Yelamarthi SK, Reddy MK, Mishra A, Mittal A. A zero-shot framework for sketch-based image retrieval. In: ECCV; 2018.
[35] Bhunia A, Yang Y, Hospedales TM, Xiang T, Song Y-Z. Sketch less for more: On-the-fly fine-grained sketch-based image retrieval. 2020 IEEE/CVF Conf Comput Vision Pattern Recognit (CVPR) 2020:9776–85.
[36] Princeton Vision & Robotics Labs. ModelNet benchmark leaderboard; 2021. http://modelnet.cs.princeton.edu/.
[37] Su H, Maji S, Kalogerakis E, Learned-Miller EG. Multi-view convolutional neural networks for 3d shape recognition. 2015 IEEE Int Conf Comput Vis (ICCV) 2015:945–53.
[38] Sfikas K, Theoharis T, Pratikakis I. Exploiting the PANORAMA representation for convolutional neural network classification and retrieval. In: Pratikakis I, Dupont F, Ovsjanikov M, editors. Eurographics Workshop on 3D Object Retrieval. The Eurographics Association; 2017. ISBN 978-3-03868-030-7.
[39] Maturana D, Scherer S. Voxnet: A 3d convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2015. p. 922–8. doi:10.1109/IROS.2015.7353481.
[40] Li B, Schreck T, Godil A, Alexa M, Boubekeur T, Bustos B, Chen J, Eitz M, Furuya T, Hildebrand K, Huang S, Johan H, Kuijper A, Ohbuchi R, Richter R, Saavedra JM, Scherer M, Yanagimachi T, Yoon GJ, Yoon SM. SHREC'12 Track: Sketch-Based 3D Shape Retrieval. In: Spagnuolo M, Bronstein M, Bronstein A, Ferreira A, editors. Eurographics Workshop on 3D Object Retrieval. The Eurographics Association; 2012. ISBN 978-3-905674-36-1.
[41] Hua B-S, Truong Q-T, Tran M-K, Pham Q-H, Kanezaki A, Lee T, Chiang H, Hsu W, Li B, Lu Y, Johan H, Tashiro S, Aono M, Tran M-T, Pham V-K, Nguyen H-D, Nguyen V-T, Tran Q-T, Phan TV, Truong B, Do MN, Duong A-D, Yu L-F, Nguyen DT, Yeung S-K. RGB-D to CAD Retrieval with ObjectNN Dataset. In: Pratikakis I, Dupont F, Ovsjanikov M, editors. Eurographics Workshop on 3D Object Retrieval. The Eurographics Association; 2017. ISBN 978-3-03868-030-7.
[42] Muthuganapathy R, Ramani K. Shape retrieval contest 2008: Cad models. In: 2008 IEEE International Conference on Shape Modeling and Applications; 2008. p. 221–2. doi:10.1109/SMI.2008.4547977.
[43] Jain A, Muthuganapathy R, Ramani K. Content-based image retrieval using shape and depth from an engineering database. In: Proceedings of the 3rd International Conference on Advances in Visual Computing - Volume Part II. Berlin, Heidelberg: Springer-Verlag; 2007. p. 255–64. ISBN 3540768556.
[44] Pu J, Ramani K. On visual similarity based 2d drawing retrieval. Comput-Aided Des 2006;38(3):249–59. doi:10.1016/j.cad.2005.10.009.
[45] Hou S, Ramani K. Calligraphic interfaces: Classifier combination for sketch-based 3d part retrieval. Comput Graph 2007;31(4):598–609. doi:10.1016/j.cag.2007.04.005.
[46] Pu J, Ramani K. A 3d model retrieval method using 2d freehand sketches. In: Sunderam VS, van Albada GD, Sloot PMA, Dongarra JJ, editors. Computational Science – ICCS 2005. Berlin, Heidelberg: Springer Berlin Heidelberg; 2005. p. 343–6. ISBN 978-3-540-32114-9.
[47] Bonnici A, Akman A, Calleja G, Camilleri KP, Fehling P, Ferreira A, Hermuth F, Israel JH, Landwehr T, Liu J, et al. Sketch-based interaction and modeling: where do we stand? Artif Intell Eng Design Anal Manuf: AI EDAM 2019;33(4):370–88.
[48] Gryaditskaya Y, Sypesteyn M, Hoftijzer JW, Pont S, Durand F, Bousseau A. Opensketch: a richly-annotated dataset of product design sketches. ACM Trans Graphics (Proc SIGGRAPH Asia) 2019;38.
[49] Han W, Xiang S, Liu C, Wang R, Feng C. Spare3d: A dataset for spatial reasoning on three-view line drawings. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020. p. 14678–87. doi:10.1109/CVPR42600.2020.01470.
[50] Zhong Y, Qi Y, Gryaditskaya Y, Zhang H, Song Y-Z. Towards practical sketch-based 3d shape generation: The role of professional sketches. IEEE Trans Circuit Syst Video Technol 2020. doi:10.1109/TCSVT.2020.3040900.
[51] Seff A, Ovadia Y, Zhou W, Adams R. Sketchgraphs: A large-scale dataset for modeling relational geometry in computer-aided design. ArXiv 2020;abs/2007.08506.
[52] Chen D-Y, Tian X-P, Shen Y-T, Ouhyoung M. On visual similarity based 3D model retrieval. Comput Graphic Forum 2003;22(3):223–32. doi:10.1111/1467-8659.00669.
[53] Canny J. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986;PAMI-8(6):679–98. doi:10.1109/TPAMI.1986.4767851.
[54] Liu D, Nabail M, Hertzmann A, Kalogerakis E. Neural contours: Learning to draw lines from 3d shapes. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020. p. 5427–35. doi:10.1109/CVPR42600.2020.00547.
[55] Li M, Lin Z, Mech R, Yumer E, Ramanan D. Photo-sketching: Inferring contour drawings from images. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV); 2019. p. 1403–12. doi:10.1109/WACV.2019.00154.
[56] Huan L, Zheng X, Xue N, He W, Gong J, Xia G. Unmixing convolutional features for crisp edge detection. CoRR 2020;abs/2011.09808.
[57] Wang Z, Simoncelli E, Bovik A. Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2003, vol. 2; 2003. p. 1398–402. doi:10.1109/ACSSC.2003.1292216.
[58] Sheikh H, Bovik A. Image information and visual quality. IEEE Trans Image Process 2006;15(2):430–44. doi:10.1109/TIP.2005.859378.
[59] Wang Z, Bovik A. A universal image quality index. IEEE Signal Process Lett 2002;9(3):81–4. doi:10.1109/97.995823.
[60] Maturana D, Scherer S. Voxnet: A 3d convolutional neural network for real-time object recognition. 2015 IEEE/RSJ Int Conf Intell Robot Syst (IROS) 2015:922–8.
[61] Feng Y, Zhang Z, Zhao X, Ji R, Gao Y. Gvcnn: Group-view convolutional neural networks for 3d shape recognition. 2018 IEEE/CVF Conf Comput Vis Pattern Recognit 2018:264–72.
[62] Kanezaki A, Matsushita Y, Nishida Y. Rotationnet for joint object categorization and unsupervised pose estimation from multi-view images. IEEE Trans Pattern Anal Mach Intell (TPAMI) 2021;43(1):269–83. doi:10.1109/TPAMI.2019.2922640.
[63] Shajahan DA, Nayel V, Muthuganapathy R. Roof classification from 3-d lidar point clouds using multiview cnn with self-attention. IEEE Geosci Remote Sens Lett 2020;17(8):1465–9. doi:10.1109/LGRS.2019.2945886.
[64] Wang F, Kang L, Li Y. Sketch-based 3d shape retrieval using convolutional neural networks. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2015. p. 1875–83. doi:10.1109/CVPR.2015.7298797.
[65] Goodfellow I, Bengio Y, Courville A. Deep Learning, Chapter - Practical Methodology. MIT Press; 2016. http://www.deeplearningbook.org
[66] Bengio Y. Practical recommendations for gradient-based training of deep architectures. CoRR 2012;abs/1206.5533.
[67] Larochelle H, Bengio Y, Louradour J, Lamblin P. Exploring strategies for training deep neural networks. J Mach Learn Res 2009;10:1–40.
[68] Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE 1998;86(11):2278–324. doi:10.1109/5.726791.
[69] Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ, editors. Advances in Neural Information Processing Systems 25. Curran Associates, Inc.; 2012. p. 1097–105.
[70] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. CoRR 2014;abs/1409.1556.
[71] Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 2818–26.
[72] Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 4700–8.
[73] Chollet F. Xception: deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357; 2016.
[74] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016. p. 770–8. doi:10.1109/CVPR.2016.90.
[75] Xie S, Girshick R, Dollár P, Tu Z, He K. Aggregated residual transformations for deep neural networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017. p. 5987–95. doi:10.1109/CVPR.2017.634.
[76] Han Z, Mo R, Yang H, Hao L. CAD assembly model retrieval based on multi-source semantics information and weighted bipartite graph. Comput Ind 2018;96:54–65. doi:10.1016/j.compind.2018.01.003.