CISI.QRY 66.6 KiB
.B
(Inform. Systems, Vol. 5, No. 4, May 1980, pp. 297-318)

.I 89
.T
Some Considerations Relating to the Cost-Effectiveness of Online Services
in Libraries
.A
Lancaster, F.W.
.W
    In 1978 Collier presented some hypothetical data on economic aspects
of the use of online services as compared with subscriptions to printed
services in libraries.  Collier's view of the economics of online searching
seems misleadingly pessimistic because:

1.  It looks only at costs but not at effectiveness in comparing the two
    modes of access and searching.  An analysis combining cost and
    effectiveness aspects (i.e., a cost-effectiveness analysis) would
    give a completely different picture.

2.  The way the cost data are presented is grossly unfair to the online
    mode of access and use.

This work contains corrected information regarding online and printed
services in libraries.
.B
(Aslib Proceedings, 33(1), January 1981, pp. 10-14, printed in Great Britain)

.I 90
.T
Co-Citation Context Analysis and the Structure of Paradigms
.A
Small, H.
.W
    Many information scientists are concerned with the operation of
document retrieval systems serving scientists in various fields.  The
scientists served by these systems are often members of what have been called
invisible colleges, groups of scientists in frequent communication with
one another and involved with highly specialized subject matters.  Often
such groups are considered to share an intellectual perspective regarding
this subject matter, which is sometimes referred to as a paradigm.

    The purpose of this paper is to show how it is possible to identify
paradigms, using the techniques of citation analysis.  I will operationalize
the notion of paradigm as a 'consensual structure of concepts in a field.'
Suppose we have obtained a set of papers pertaining to some topic.  Already
knowing something about the field, we read each text and mark passages in
which certain specific concepts are used or discussed.  For example, we
might find that a concept designated 'A' appears in some sub-set of the
papers.  Suppose further that we identify those papers in which concepts 'A'
and 'B' are used together in the same papers in a certain specified manner.
Clearly not all concepts will combine in a natural way, and not all authors
combining concepts 'A' and 'B' will do so in the same way, though some
predominant mode may emerge.  For a set of n concepts their structure is
given by the totality of admissible combinations of concepts taken from
two to n at a time.  The frequency with which a given combination occurs
in the sample of papers on the topic is a measure of the degree of consensus
regarding the particular concept combination within the corpus.  For
concepts taken two at a time, the structure can be displayed as a graph with
concepts as nodes and the relations between them represented as lines (arcs)
connecting the nodes.  This definition of concept structure is
similar to the semantic network of artificial intelligence except that in
our approach a measure of consensus weights each arc of the graph.
.B
(Journal of Documentation, Vol. 36, No. 3, September 1980, pp. 183-196)

.I 91
.T
Cocited Author Retrieval Online:
An Experiment with the Social Indicators Literature
.A
White, H.D.
.W
    One mode of online retrieval in Scisearch or Social Scisearch involves
entering pairs of authors' names believed to be jointly cited by
subsequent writers and retrieving papers in which cocitations occur.  Six
pairs were formed with the names of four authors prominent in the social
indicators movement (Bauer, Duncan, Land, and Sheldon).  Documents by the
four were not specified.  It was thought that the pair Duncan and Land
would retrieve papers in which indicator-type data would be integrated with
path-analytic causal modeling.  All other pairs seemed likely to retrieve a
"general social indicators" literature.  The 298 retrieved papers confirmed
expectations.  It was found that 121 papers generally cited social indicators
(SI) documents by the input authors and frequently had SI language in
their titles.  Other signs of content also identified them as papers of
the SI movement.  The 177 papers retrieved on Duncan and Land generally
cited causal modeling documents by the input pair and were path-analytic
in nature.  As expected, they were relatively "harder" than the first
group of papers, although the two groups are akin and are formally linked
through citations in certain papers.  An additional result is that papers
citing at least three of the input authors tend to be overviews of the SI
movement.
.B
(JASIS, Vol. 32, No. 1, January 1981, pp. 16-21)

.I 92
.T
Database and Online Statistics for 1979
.A
Williams, M.E.
.W
    The number of databases, records contained in databases and the online
use of databases has increased dramatically over the past several years,
bringing the 1979 totals for bibliographic, bibliographic-related, and
natural language databases to 528.  These 528 databases contain 148 million
records.  Some 4 million online searches were conducted via the major U.S.
and Canadian systems in 1979.
.B
(Bulletin of ASIS, Vol. 7, No. 2, December 1980, pp. 27-29)

.I 93
.T
Experiments in Local Metrical Feedback in Full-Text Retrieval Systems
.A
Attar, R.
Fraenkel, A.S.
.W
    A method of iterative searching, using the results of one iteration search
to formulate the next iteration search, was applied to a full-text database
consisting of some 2400 documents and 1,300,000 text-words of Hebrew and
Aramaic.  The iterative method consists of clustering the documents returned
in an iteration, using weighting by proximity and by frequency simultaneously.
The process produces searchonyms, which are terms synonymous to keywords in the
context of a single query.  Augmenting or replacing keywords by searchonyms
via manual or automatic feedback leads to the formulation of the next iteration
search.  The results of the experiment are consistent with those of an earlier
small-scale experiment on an English database, and indicate that in contrast
to global clustering where the size of matrices limits applications to small
databases and improvements are doubtful, local metrical methods appear to be
well suited to arbitrarily large databases, improving precision and recall
simultaneously.  Further experiments using more test-queries run on even
larger databases should be made to collect further evidence as to the
performance of these methods.
.B
(Information Processing & Management, Vol. 17, No. 3, 1981, pp. 115-126)

.I 94
.T
A Microcomputer Alternative for Information Handling:  REFLES
.A
Bivins, K.T.
Palmer, R.C.
.W
    REFLES is a microcomputer-based system for data retrieval in library
environments.  The problem of information retrieval is discussed from a
theoretical point of view, followed by an analysis of the reference process
and data thereby gathered, leading to a description of REFLES in terms of
its hardware and software.  REFLES, a prototype system at present, currently
functions in a test environment.  Examples of data contained in the system
and of its use are presented.  Future considerations and speculations on
other versions of the system conclude the paper.
.B
(Information Processing & Management, Vol. 17, No. 2, 1981, pp. 93-101)

.I 95
.T
A Comparison of Two Systems of Weighted Boolean Retrieval
.A
Bookstein, A.
.W
    A major deficiency of traditional Boolean systems is their inability to
represent the varying degrees to which a document may be written on a subject.
In this article we isolate a number of criteria that should be met by any
Boolean system generalized to have a weighting capability.  It is proven that
only one weighting rule satisfies these conditions--that associated with fuzzy-
set theory--and that this weighting scheme satisfies most of the other
properties associated with Boolean algebra as well.  Probabilistic weighting
is then introduced as an alternative approach and the two systems compared.
In the limit of zero/one weights, all systems considered converge to
traditional Boolean retrieval.
.B
(JASIS, Vol. 32, No. 4, July 1981)

.I 96
.T
Threshold Values and Boolean Retrieval Systems
.A
Buell, D.A.
Kraft, D.H.
.W
    Several papers have appeared that have analyzed recent developments
in the problem of processing, in a document retrieval system, queries expressed
as Boolean expressions.  The purpose of this paper is to continue that analysis.
We shall show that the concept of threshold values resolves the problems
inherent with relevance weights.  Moreover, we shall explore possible evaluation
mechanisms for retrieval of documents, based on fuzzy-set-theoretic
considerations.
.B
(Information Processing & Management, Vol. 17, No. 3, 1981, pp. 127-136)

.I 97
.T
A Model for a Weighted Retrieval System
.A
Buell, D.A.
Kraft, D.H.
.W
    There has been a good deal of work on information retrieval systems that
have continuous weights assigned to the index terms that describe the records
in the database, and/or to the query terms that describe the user queries.
Recent articles have analyzed retrieval systems with continuous weights of
either type and/or with a Boolean structure for the queries.  They have also
suggested criteria which such systems ought to satisfy and record evaluation
mechanisms which partially satisfy these criteria.  We offer a more careful
analysis, based on a generalization of the discrete weights.  We also look
at the weights from an entirely different approach involving thresholds, and
we generate an improved evaluation mechanism which seems to fulfill a larger
subset of the desired criteria than previous mechanisms.  This new mechanism
allows the user to attach a "threshold" to the query term.
.B
(JASIS, Vol. 32, No. 3, May 1981, pp. 211-216)

.I 98
.T
A Translating Computer Interface for End-User Operation of Heterogeneous
Retrieval Systems.  I. Design
.A
Marcus, R.S.
Reintjes, J.F.
.W
    Online retrieval systems may be difficult to use, especially by end
users, because of heterogeneity and complexity.  Investigations have concerned
the concept of a translating computer interface as a means to simplify access
to, and operation of, heterogeneous bibliographic retrieval systems and
databases.  The interface allows users to make requests in a common language.
These requests are translated by the interface into the appropriate commands
for whatever system is being interrogated.  System responses may also be
transformed by the interface into a common form before being given to the
users.  Thus, the network of different systems is made to look like a single
"virtual" system to the user.  The interface also provides instruction and
other search aids for the user.  The philosophy, design, and implementation
of an experimental interface named CONIT are described.
.B
(JASIS, Vol. 32, No. 4, July 1981, pp. 287-303)

.I 99
.T
A Translating Computer Interface for End-User Operation of
Heterogeneous Retrieval Systems.  II. Evaluations
.A
Marcus, R.S.
Reintjes, J.F.
.W
    The evaluation of the concept of a translating computer interface for
simplifying operation of multiple, heterogeneous online bibliographic
retrieval systems has been undertaken.  An experimental retrieval system,
named CONIT, was built and tested under controlled conditions with
inexperienced end users.  A detailed analysis of the experimental usages
showed that users were able to master interface operation sufficiently well
to find relevant document references.  Success was attributed, in part,
to a simple command language, adequate online instruction, and a simplified
natural-language, keyword/stem approach to searching.  It is concluded that
operational interfaces of the type studied can provide for increased usability
of existing systems in a cost-effective manner, especially for searchers.
Furthermore, more advanced interfaces based on improved instruction and
automated search strategy techniques could further enhance retrieval
effectiveness for a wide class of users.
.B
(JASIS, Vol. 32, No. 4, July 1981, pp. 304-317)

.I 100
.T
The Interface Between Computerized Retrieval Systems and Micrographic
Retrieval Systems
.A
McMurdo, G.
.W
    This paper notes the benefits accruing from interaction between computerized
retrieval systems and micrographic retrieval systems.  It reviews the current state
of automated micrographic retrieval technology.  The conclusion is that with a
combination of advances in communications technology, and sophisticated indexing
input from libraries and information scientists, the new generation of automated
micrographic devices may constitute the on-line document retrieval systems of the
future.
.B
(Journal of Information Science, 1, 1980, pp. 345-349)

.I 101
.T
Parallel Computations in Information Retrieval
.A
Salton, G.
Bergmark, D.
.W
    Conventional information retrieval processes are largely based on data
movement, pointer manipulations and integer arithmetic; more refined retrieval
algorithms may in addition benefit from substantial computational power.

    In the present study a number of parallel processing methods are described
that serve to enhance retrieval services.  In conventional retrieval
environments parallel list processing and parallel search facilities are of
greatest interest.  In more advanced systems, the use of array processors
also proves beneficial.  Various information retrieval processes are examined
and evidence is given to demonstrate the usefulness of parallel processing
and fast computational facilities in information retrieval.
.B
(In Lecture Notes in Computer Science, No. 111, W. Handler, Ed., Springer
Verlag, Berlin-New York, 1981, pp. 328-342)

.I 102
.T
The Measurement of Term Importance in Automatic Indexing
.A
Salton, G.
Wu, H.
Yu, C.T.
.W
    The frequency characteristics of terms in the documents of a collection
have been used as indicators of term importance for content analysis and
indexing purposes.  In particular, very rare or very frequent terms are
normally believed to be less effective than medium-frequency terms.  Recently
automatic indexing theories have been devised that use not only the term
frequency characteristics but also the relevance properties of the terms.
The major term-weighting theories are first briefly reviewed.  The term
precision and term utility weights that are based on the occurrence
characteristics of the terms in the relevant, as opposed to the nonrelevant,
documents of a collection are then introduced.  Methods are suggested for
estimating the relevance properties of the terms based on their overall
occurrence characteristics in the collection.  Finally, experimental
evaluation results are shown comparing the weighting systems using the term
relevance properties with the more conventional frequency-based methodologies.
.B
(JASIS, Vol. 32, No. 3, May 1981, pp. 175-186)

.I 103
.T
NDX-100:  An Electronic Filing Machine for the Office of the Future
.A
Slonim, J.
MacRae, L.J.
Mennie, W.E.
Diamond, N.
.W
    This paper describes the design and implementation of an "electronic filing
machine," a machine which is capable of storing large numbers of "unstructured"
documents in such a way that a particular document may be easily and quickly
retrieved.  A functional distributed architecture permits the implementation
of the system in a mixture of hardware and software.
.B
(Computer, Vol. 14, No. 5, May 1981, pp. 24-36)

.I 104
.T
The Selection of Good Search Terms
.A
van Rijsbergen, C.J.
Harper, D.J.
Porter, M.F.
.W
    This paper tackles the problem of how one might select further search terms,
using relevance feedback, given the search terms in the query.  These search
terms are extracted from a maximum spanning tree connecting all the terms in the
index term vocabulary.  A number of different spanning trees are generated from
a variety of association measures.  The retrieval effectiveness for the
different spanning trees is shown to be approximately the same.  Effectiveness
is measured in terms of precision and recall, and the retrieval tests are done
on three different test collections.
.B
(Information Processing & Management, Vol. 17, No. 2, 1981, pp. 77-91)

.I 105
.T
Indexing Consistency, Quality and Efficiency
.A
Rolling, L.
.W
    Indexing quality determines whether the information content of an indexed
document is accurately represented.  Indexing effectiveness measures whether
an indexed document is correctly retrieved every time it is relevant to a
query.  Measurement of these criteria is cumbersome and costly; data base
producers therefore prefer inter-indexer consistency as a measure of indexing
quality or effectiveness.  The present article assesses the validity of this
substitution in various environments.
.B
(Information Processing & Management, Vol. 17, No. 2, 1981, pp. 69-76)

.I 106
.T
Text Passage Retrieval Based on Colon Classification:  Retrieval Performance
.A
Shepherd, M.A.
.W
    A set of experiments was conducted to determine the suitability of the
Colon Classification as a foundation for the automated analysis, representation
and retrieval of primary information from the full text of documents.  Primary
information is that information embodied in the text of a document, as opposed
to secondary information which is generally in such forms as:  an abstract, a
table of contents, or an index.
    Full text databases were created in two subject areas and queries solicited
from specialists in each area.  An automated full text indexing system, along
with four automated passage retrieval systems, was created to test the various
features of the Colon Classification.  Two Boolean-based systems and one simple
word occurrence system were created in order to compare the retrieval results
against types of systems which are in more common use.  The systems' retrieval
performances were measured using recall and precision and the mean expected
search length reduction factors.
    Overall, it was found that the Colon Classification-based systems did not
perform significantly better than the other systems.
.B
(Journal of Documentation, Vol. 37, No. 1, March 1981, pp. 25-35)

.I 107
.T
User-Responsive Subject Control in Bibliographic Retrieval Systems
.A
Tague, J.M.
.W
    A study was carried out of the relationship between the vocabulary of
user queries and the vocabulary of documents relevant to the queries, and
the value of adding to the document description record in a retrieval system
keywords from previous queries for which the document had proved useful.
Two test databases incorporating user query keywords were implemented at
the School of Library and Information Science, University of Western
Ontario.  Clustering of the documents via title and user keywords, a
statistical analysis of title-user keyword co-occurrences, and retrieval
tests were used to examine the effect of the added keywords.  Results
showed the impracticality of the procedure in an operational setting, but
indicated the value of analyses with sample data in the development and
maintenance of keyword dictionaries and thesauri.
.B
(Information Processing & Management, Vol. 17, No. 3, 1981, pp. 149-159)

.I 108
.T
A Program for Machine-Mediated Searching
.A
Toliver, D.
.W
    A technique of online instruction and assistance to bibliographic data
base searchers called Individualized Instruction for Data Access (IIDA) is
being developed by Drexel University.  IIDA assists searchers by providing
feedback based on real-time analysis while searches are being performed.
Extensive help facilities which draw on this analysis are available to
users.  Much of the project's experimental work, as described elsewhere,
is concerned with the process of searching and the behavior of searchers.
This paper will largely address itself to the project's computer system, which
is being developed by subcontract with the Franklin Institute's Science
Information Services.
.B
(Information Processing & Management, Vol. 17, No. 2, 1981, pp. 61-68)

.I 109
.T
Author Cocitation:  A Literature Measure of Intellectual Structure
.A
White, H.D.
Griffith, B.C.
.W
    It is shown that the mapping of a particular area of science, in this
case information science, can be done using authors as units of analysis and
the cocitations of pairs of authors as the variable that indicates their
"distances" from each other.  The analysis assumes that the more two authors
are cited together, the closer the relationship between them.  The raw data
are cocitation counts drawn online from Social Scisearch (Social Sciences
Citation Index) over the period 1972-1979.  The resulting map shows
(1) identifiable author groups (akin to "schools") of information science,
(2) locations of these groups with respect to each other, (3) the degree of
centrality and peripherality of authors within groups, (4) proximities of
authors within group and across group boundaries ("border authors" who seem
to connect various areas of research), and (5) positions of authors with
respect to the map's axes, which were arbitrarily set spanning the most
divergent groups in order to aid interpretation.  Cocitation analysis of
authors offers a new technique that might contribute to the understanding of
intellectual structure in the sciences and possibly in other areas to the
extent that those areas rely on serial publications.  The technique
establishes authors, as well as documents, as an effective unit in
analyzing subject specialties.
.B
(JASIS, Vol. 32, No. 3, May 1981, pp. 163-171)

.I 110
.T
Progress in Documentation.  Word Processing:
An Introduction and Appraisal
.A
Whitehead, J.
.W
    The "Office of the Future," "Office Technology," "Word Processing,"
"Electronic Mail," "Electronic Communications," "Convergence," "Information
Management."  These are all terms included in the current list of buzz words
used to describe current activities in the office technology area.  The high
level of investment in factories and plants and the ever-increasing fight to
improve productivity by automating the dull, routine jobs are usually quoted
and compared with the extremely low investment in improving and automating
the equally tedious routine jobs in the office environment; the investment
in the factory is quoted as being ten times greater per employee than in the
office.  This, however, is changing rapidly and investment on a large scale
is already taking place in many areas as present-day inflation bites hard,
forcing many companies and organizations to take a much closer look at their
office operations.
.B
(Journal of Documentation, Vol. 36, No. 4, December 1980, pp. 313-341)

.I 111
.T
Document Clustering Using an Inverted File Approach
.A
Willett, P.
.W
    An automated document clustering procedure is described which does not
require the use of an inter-document similarity matrix and which is independent
of the order in which the documents are processed.  The procedure makes use of
an initial set of clusters which is derived from certain of the terms in the
indexing vocabulary used to characterise the documents in the file.  The
retrieval effectiveness obtained using the clustered file is compared with that
obtained from serial searching and from use of the single-linkage clustering
method.
.B
(Journal of Information Science, 2, 1980, pp. 222-231)

.I 112
.T
A Fast Procedure for the Calculation of Similarity Coefficients
in Automatic Classification
.A
Willett, P.
.W
    A fast algorithm is described for comparing the lists of terms representing
documents in automatic classification experiments.  The speed of the procedure
arises from the fact that all of the non-zero-valued coefficients for a given
document are identified together, using an inverted file to the terms in the
document collection.  The complexity and running time of the algorithm are
compared with previously described procedures.
.B
(Information Processing & Management, Vol. 17, No. 2, 1981, pp. 53-60)